Underwater Coherent Source Direction-of-Arrival Estimation Method Based on PGR-SubspaceNet

Guo, Tuo; Xu, Yunyan; Bi, Yang; Ding, Shaochun; Huang, Yong

doi:10.3390/electronics13112171

by Tuo Guo ¹,², Yunyan Xu ², Yang Bi ³,*, Shaochun Ding ⁴ and Yong Huang ⁴,*

¹ School of Mechanical Engineering, Zhejiang University, Hangzhou 310030, China

² School of Electronic Information and Artificial Intelligence, Shaanxi University of Science & Technology, Xi’an 710021, China

³ School of Electronic Engineering, Xi’an Aeronautical Institute, Xi’an 710077, China

⁴ Ningbo BoHai ShenHeng Technology Co., Ltd., Ningbo 315048, China

* Authors to whom correspondence should be addressed.

Electronics 2024, 13(11), 2171; https://doi.org/10.3390/electronics13112171

Submission received: 29 April 2024 / Revised: 21 May 2024 / Accepted: 24 May 2024 / Published: 3 June 2024

Abstract

In the field of underwater acoustics, the signal-to-noise ratio (SNR) is generally low, and the underwater environment is complex and variable, making target azimuth estimation highly challenging. Traditional model-based subspace methods exhibit significant performance degradation when dealing with coherent sources, low SNR, and small snapshot data. To overcome these limitations, an improved model based on SubspaceNet, called PConv-GAM Residual SubspaceNet (PGR-SubspaceNet), is proposed. This model embeds the global attention mechanism (GAM) into residual blocks that fuse PConv convolution, making it possible to capture richer cross-channel and positional information. This enhancement helps the model learn signal features in complex underwater conditions. Simulation results demonstrate that the underwater target azimuth estimation method based on PGR-SubspaceNet exhibits lower root mean square periodic error (RMSPE) values when handling different numbers of narrowband coherent sources. Under low SNR and limited snapshot conditions, its RMSPE values are significantly better than those of traditional methods and SubspaceNet-based enhanced subspace methods. PGR-SubspaceNet extracts more features, further improving the accuracy of direction-of-arrival estimation. Preliminary experiments in a pool validate the effectiveness and feasibility of the underwater target azimuth estimation method based on PGR-SubspaceNet.

Keywords: SubspaceNet; azimuth estimation; subspace-based algorithms; residual module

1. Introduction

In array signal processing, direction-of-arrival (DOA) estimation is a traditional and highly valuable research technique. In many fields, such as communication, radar, and sonar, numerous researchers have studied it extensively and in depth [1,2,3]. The DOA estimation of underwater targets involves collecting acoustic signals through a multi-hydrophone array, thereby estimating the angle of incidence of the target. This estimation plays a crucial role in battlefield reconnaissance, underwater navigation, and marine development. However, because of the complex and variable underwater environment, inherent errors in hydrophone arrays, and limitations of the DOA estimation algorithms, most DOA estimation algorithms have some shortcomings in practical applications. Among these azimuth estimation algorithms, the most classic ones are subspace-based DOA estimation methods, such as the multiple signal classification (MUSIC) [4], root-multiple signal classification (Root-MUSIC) [5], and estimating signal parameters via rotational invariance techniques (ESPRIT) [6] algorithms. However, when dealing with coherent signals and small snapshot data, the performance of these methods significantly decreases [7]. Therefore, it is necessary to explore further and develop more-suitable DOA estimation techniques to improve the accuracy and robustness of DOA estimation.

Machine learning is an important component of artificial intelligence technology. As early as the 1980s, scholars began to explore the feasibility of applying machine learning to array-based target azimuth estimation. Neural networks were the earliest machine learning algorithms used for DOA estimation. In reference [8], the DOA estimation problem was transformed into an optimization problem and mapped onto neural network nodes; the extreme points of the objective function were then obtained through neural network optimization to achieve DOA estimation. Compared with traditional subspace-based methods, this type of method avoids complex calculations such as matrix eigendecomposition and peak search, and has outstanding advantages in computational efficiency. However, its underlying principle for azimuth estimation is still that of model-based signal-processing methods such as subspace methods. Later, newer neural networks, such as the Self-Organizing Map (SOM) [9], the Self-Organizing Fuzzy Inference Network (SONFIN) [10], and the Radial Basis Function Neural Network (RBFNN) [11], were successively used in target azimuth estimation. However, neural networks are difficult to interpret and also suffer from overfitting. Reference [12] transformed the DOA estimation problem into a multiclass classification problem: the eigenvectors of the sample covariance matrix were paired with known target azimuths to form a training set for a support vector machine (SVM), which could then predict the azimuths of unknown targets. This is a very inspiring approach. Subsequently, many SVM-based target azimuth estimation methods have been proposed [13].

Deep learning (DL), a new type of machine-learning method, has sparked widespread research and applications in various fields since its proposal. DL can reveal complex nonlinear relationships between data/signals and labels, thereby achieving better model prediction performance. This has led to the emergence of data-driven (DD) DOA estimation methods, such as DOA estimators assisted by neural networks (NNs) [14]. Barthelme and Utschick [15] proposed using NNs to estimate the covariance matrix of the entire array from the sample covariance matrices of subarrays. By applying the MUSIC estimator to the reconstructed full covariance matrix, this technique can achieve high-resolution estimation of more targets. The method addresses the basic dependency of MUSIC on the estimated covariance matrix, thereby improving the algorithm. However, it does not fully leverage the NN to improve the high-resolution capability of MUSIC, because the NN only uses the sample covariance matrix as the label for training and does not introduce the MUSIC algorithm during training. Merkofer et al. [16] proposed the deep augmented MUSIC algorithm, which is a model-based and data-driven DOA estimator. It uses specially designed neural structures to improve the performance and robustness of the MUSIC algorithm, locating coherent sources and estimating the number of coherent signal sources. Shmuel et al. [17] designed SubspaceNet, which augments subspace methods by using a deep neural network (DNN)-based autoencoder to obtain a surrogate covariance matrix from the observed signals and combining it with subspace-based algorithms to estimate the target azimuth. However, current DD DOA estimation methods have been researched more in the field of electromagnetic-wave antenna arrays, while research in the field of underwater acoustic signal processing has been relatively scarce.

DOA estimation based on SubspaceNet copes well with coherent sources, low signal-to-noise ratio (SNR), few snapshots, and array calibration mismatch, overcoming the limitations of model-based subspace methods. Inspired by SubspaceNet, and to meet the demands of underwater target detection and recognition in complex underwater environments, this study introduces an improved residual module and a global attention mechanism (GAM) into SubspaceNet to achieve more comprehensive feature extraction and make better use of the data. The enhanced feature extraction capability further improves the accuracy of underwater target azimuth estimation with low SNR, coherent signals, and few snapshots. The main contributions are as follows.

(1) We propose a residual block that fuses PConv and GAM: PConv-GAM Residual. On the one hand, this module is based on the ResNet residual block and, by introducing the global attention mechanism, enables the network to adaptively focus on the important channel features, learn more complex feature representations, and improve the representation ability of the model. On the other hand, PConv-GAM Residual embeds the PConv convolution in front of the Conv 1 × 1 convolution of the residual block, which not only makes the network pay more attention to the information at the center, but also significantly enhances the ability of the model to extract spatial features, which helps it learn signal features under complex hydroacoustic conditions. PConv-GAM Residual can therefore enhance the feature representation ability of the model and focus on different levels of features, thus improving the detection performance of the model.

(2) We constructed the PGR-SubspaceNet network by embedding the PConv-GAM Residual in SubspaceNet. Efficient DOA estimation was achieved by applying a subspace-based algorithm to the surrogate covariance matrices generated by this network. In addition, the performance of PGR-SubspaceNet under different conditions, including different numbers of coherent sources, low signal-to-noise ratio, and limited snapshots, is explored in more depth, and the validity of the model is preliminarily verified.

By combining the above contributions, our proposed underwater coherent source DOA estimation model based on PGR-SubspaceNet aims to overcome the limitations of the model-based subspace method and further improve the accuracy of underwater coherent source DOA estimation.

2. Theoretical Basis

2.1. Array Signal Model

A homogeneous hydrophone linear array consisting of $M$ elements is considered, with its steering vector denoted by $a(\theta)$. It is assumed that $K$ ($K < M$) far-field narrowband signals are incident on the array. The angles of incidence (DOAs) of these sources are expressed as a vector, $\theta = [\theta_{1}, \dots, \theta_{K}]$. The array signal model for a uniform linear array [18] is shown in Figure 1.

The signal observed by the array at time $t$ can be expressed as [19]:

$$x(t) = A(\theta)\, s(t) + n(t), \quad t = 1, \dots, N, \tag{1}$$

where the array steering matrix is represented as $A(\theta) = [a(\theta_{1}), a(\theta_{2}), \dots, a(\theta_{K})]$, $a(\theta_{l})$ is the steering vector for direction $\theta_{l}$, and $\Delta$ is the array element spacing.

Here, $s(t)$ represents the response of the reference element to the target signal, and $n(t)$ represents the additive white Gaussian noise received by the array element at time $t$. In addition, $N$ is the number of snapshots. The statistical covariance matrix of the array output can be expressed as:

$$R = E(x x^{H}) = A R_{s} A^{H} + R_{N}, \tag{2}$$

where $E(\cdot)$ denotes mathematical expectation, $(\cdot)^{H}$ denotes conjugate transposition, $R_{s} = E(s(t)\, s^{H}(t))$ is the signal covariance matrix, and $R_{N} = E(n(t)\, n^{H}(t))$ is the noise covariance matrix.

In practice, the sample covariance matrix $\hat{R}_{X} = \frac{1}{N} \sum_{t = 1}^{N} x(t)\, x^{H}(t)$ is usually employed instead of the true statistical covariance matrix.

The output of the array signal is represented in matrix form [20]:

$$X = A S + N. \tag{3}$$
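The signal model in Equations (1) to (3) can be illustrated with a short NumPy sketch. The steering-vector form, its sign convention, and the half-wavelength element spacing used below are assumptions made for illustration only, since the text does not state them explicitly; the function names are likewise hypothetical.

```python
import numpy as np

def steering_vector(theta, M, spacing=0.5):
    """ULA steering vector. `spacing` is in wavelengths (assumed 0.5);
    the exp(-j...) sign convention is also an assumption."""
    m = np.arange(M)
    return np.exp(-2j * np.pi * spacing * m * np.sin(theta))

def simulate_snapshots(thetas, M, N, snr_db, coherent=False, seed=None):
    """Return X = A S + noise (M x N) and the steering matrix A (M x K)."""
    rng = np.random.default_rng(seed)
    K = len(thetas)
    A = np.stack([steering_vector(th, M) for th in thetas], axis=1)
    if coherent:
        # fully coherent case: every source carries the same waveform
        s0 = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)
        S = np.tile(s0, (K, 1))
    else:
        S = (rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))) / np.sqrt(2)
    noise_std = 10 ** (-snr_db / 20)  # unit-power sources, so SNR sets the noise level
    noise = noise_std * (rng.standard_normal((M, N))
                         + 1j * rng.standard_normal((M, N))) / np.sqrt(2)
    return A @ S + noise, A

def sample_covariance(X):
    """Sample covariance: (1/N) * sum_t x(t) x^H(t)."""
    return X @ X.conj().T / X.shape[1]

X, A = simulate_snapshots(np.deg2rad([-20.0, 35.0]), M=8, N=100, snr_db=10, seed=0)
R_hat = sample_covariance(X)
print(X.shape, R_hat.shape)
```

Setting `coherent=True` reproduces the perfectly coherent situation in which the signal covariance matrix loses rank, which is the case the later sections focus on.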

2.2. Input Data and Labels

SubspaceNet employs a trainable architecture that is capable of mapping the input signal X to an estimate of the covariance matrix, denoted as the surrogate covariance matrix, irrespective of the snapshot count. The algorithm workflow can be delineated as three stages: (1) extracting the features essential for estimating the surrogate covariance task, (2) processing these extracted features utilizing a DNN-based autoencoder, and (3) postprocessing the autoencoder output to derive an estimated covariance matrix. An overview of the entire process is depicted in Figure 2.

The network takes the empirical autocorrelation of $x(t)$ as its input feature, i.e.,

$$\hat{R}_{X}[\tau] = \frac{1}{N - \tau} \sum_{t = 1}^{N - \tau} x(t)\, x^{H}(t + \tau), \tag{4}$$

where $\tau \in \{0, \dots, \min(M, N - 1)\}$ and $N$ is the number of snapshots.
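As a rough illustration of Equation (4), the lagged autocorrelation features can be computed as follows. This is a minimal sketch that treats the normalizing length in Equation (4) as the number of snapshots N; the function name and the random test input are hypothetical, and how the complex features are packed into real-valued network input is not shown.

```python
import numpy as np

def autocorrelation_features(X, tau_max):
    """Empirical autocorrelation for lags 0..tau_max.
    X has shape (M, N); the result has shape (tau_max + 1, M, M)."""
    M, N = X.shape
    feats = []
    for tau in range(tau_max + 1):
        # columns 0..N-tau-1 are x(t); columns tau..N-1 are x(t + tau)
        R_tau = X[:, : N - tau] @ X[:, tau:].conj().T / (N - tau)
        feats.append(R_tau)
    return np.stack(feats)

rng = np.random.default_rng(0)
M, N = 8, 100
X = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)
feats = autocorrelation_features(X, tau_max=min(M, N - 1))
print(feats.shape)  # (9, 8, 8)
```

The lag-0 slice coincides with the ordinary sample covariance matrix, so the feature set can be read as the sample covariance extended with time-lagged cross-correlations.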

The training data form a dataset $D$ consisting of $J$ pairs of observations $X$ and their corresponding DOAs $\theta$:

$$D = \{(X^{(j)}, \theta^{(j)})\}_{j = 1}^{J}. \tag{5}$$

The dataset is randomly divided into a training set and a test set at a ratio of 9:1.

This study is based on the trainable architecture of SubspaceNet. In other words, the surrogate covariance matrix processed by the DNN autoencoder is combined with the Root-MUSIC algorithm to obtain the target orientation, and these estimated orientations are compared with the true orientation in Equation (5) to calculate the training loss [21].
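To make the role of Root-MUSIC in this pipeline more concrete, the following is a minimal NumPy sketch of a textbook Root-MUSIC estimator applied to a covariance matrix. It is not the authors' implementation: it assumes a half-wavelength uniform linear array with steering entries exp(-j·2π·d·m·sin θ), it is not differentiable as written, and the demonstration uses an ordinary sample covariance from incoherent simulated sources rather than a network-generated surrogate.

```python
import numpy as np

def root_music(R, K, spacing=0.5):
    """Textbook Root-MUSIC for an M-element ULA (spacing in wavelengths, assumed)."""
    M = R.shape[0]
    _, eigvecs = np.linalg.eigh(R)            # eigenvalues in ascending order
    En = eigvecs[:, : M - K]                  # noise subspace
    C = En @ En.conj().T
    # polynomial coefficients (highest power first): sums along the diagonals of C
    coeffs = np.array([np.trace(C, offset=k) for k in range(M - 1, -M, -1)])
    roots = np.roots(coeffs)
    roots = roots[np.abs(roots) < 1.0]        # one root of each reciprocal pair
    roots = roots[np.argsort(np.abs(roots))[::-1][:K]]  # K roots closest to the unit circle
    sin_theta = -np.angle(roots) / (2 * np.pi * spacing)
    return np.arcsin(np.clip(sin_theta, -1.0, 1.0))

# Demonstration on simulated data (all values below are illustrative assumptions)
rng = np.random.default_rng(0)
M, N, K = 8, 200, 2
true_deg = np.array([-20.0, 35.0])
m = np.arange(M)[:, None]
A = np.exp(-2j * np.pi * 0.5 * m * np.sin(np.deg2rad(true_deg))[None, :])
S = (rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))) / np.sqrt(2)
W = 10 ** (-10 / 20) * (rng.standard_normal((M, N))
                        + 1j * rng.standard_normal((M, N))) / np.sqrt(2)
X = A @ S + W
R_hat = X @ X.conj().T / N
est_deg = np.sort(np.rad2deg(root_music(R_hat, K)))
print(est_deg)
```

In the trainable architecture, the same kind of estimator would be applied to the surrogate covariance instead of `R_hat`, and the resulting angles compared with the labels to form the loss.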

3. Residual SubspaceNet Target Azimuth Estimation Based on Deep Fusion of PConv and GAM

PConv-GAM Residual SubspaceNet

SubspaceNet consists of a DNN autoencoder together with nontrainable preprocessing and postprocessing stages. The autoencoder comprises three convolutional layers (Conv) with 16, 32, and 64 output channels, followed by three deconvolution layers with 32, 16, and 1 output channels. The kernel size for each layer is 2 × 2. The output of the autoencoder is a real 2M × M matrix. To represent the surrogate covariance, the first M rows are used as the real part and the remaining M rows as the imaginary part, yielding a complex M × M matrix $K$ (not to be confused with the number of sources). A surrogate covariance is then constructed from $K$ and a hyperparameter $\varepsilon$, set to 1:

$$\hat{R} = K K^{H} + \varepsilon I_{M}, \tag{6}$$

where $I_{M}$ is the M × M identity matrix. $\theta$ can thus be estimated from $\hat{R}$ by applying the subspace method.
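The postprocessing step in Equation (6) can be sketched in a few lines of NumPy. The random array below merely stands in for an autoencoder output and is purely illustrative; the function name is hypothetical, and the identity matrix is sized to match the M × M product.

```python
import numpy as np

def surrogate_covariance(ae_out, eps=1.0):
    """Build a surrogate covariance from a real (2M, M) autoencoder output."""
    M = ae_out.shape[1]
    # first M rows: real part; remaining M rows: imaginary part
    K_mat = ae_out[:M, :] + 1j * ae_out[M:, :]
    return K_mat @ K_mat.conj().T + eps * np.eye(M)

rng = np.random.default_rng(0)
ae_out = rng.standard_normal((16, 8))   # stand-in for the network output, M = 8
R_sur = surrogate_covariance(ae_out)
print(R_sur.shape, np.allclose(R_sur, R_sur.conj().T))
```

By construction the result is Hermitian and positive definite with eigenvalues no smaller than ε, so it is always a valid input for an eigendecomposition-based subspace method, whatever the network produces.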

To make the target information contained in the generated surrogate covariance matrix more accurate and enhance the accuracy of DOA estimation of underwater coherent sources, SubspaceNet needs to be optimized further. Under the same optimization conditions, a deeper NN is obviously more powerful in terms of training effect. However, DNNs face a serious overfitting problem caused by the increase in the number of parameters and the increase in model complexity, which is not conducive to practical applications. The classical ResNet introduces the concept of residual learning [22], which uses a shortcut connection, adding the inputs directly to the outputs. This makes the network easier to optimize and train and avoids the problem of overfitting if the network is too deep, as shown in Figure 3a. Therefore, to improve the feature extraction ability and reduce the depth of the model as much as possible, an attempt was made to embed the ResNet residual block in front of the first deconvolution layer of SubspaceNet.

Based on the above, to make fuller and more-effective use of the information from all channels, while reducing redundant computation and memory access to extract spatial features more efficiently, a partial convolutional layer (PConv) [23] is attached to the Conv 1 × 1 convolution of the ResNet residual block. In this way, the effective receptive field on the input feature map looks like a T-shaped convolutional layer, which makes the network more focused on the center than a conventional convolutional layer that treats patches evenly.
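The idea behind partial convolution can be illustrated with a small NumPy sketch: a regular convolution is applied to only a fraction of the channels, and the remaining channels pass through unchanged. The 1/4 channel ratio, the 3 × 3 kernel, and the naive convolution routine below are assumptions chosen for illustration; the text does not specify these settings for PGR-SubspaceNet.

```python
import numpy as np

def conv2d_same(x, w):
    """Naive zero-padded 'same' convolution (cross-correlation form).
    x: (C_in, H, W); w: (C_out, C_in, k, k) with odd k."""
    c_out, _, k, _ = w.shape
    p = k // 2
    H, W = x.shape[1:]
    xp = np.pad(x, ((0, 0), (p, p), (p, p)))
    y = np.zeros((c_out, H, W))
    for i in range(k):
        for j in range(k):
            # contribution of kernel tap (i, j) to every output position
            y += np.einsum('oc,chw->ohw', w[:, :, i, j], xp[:, i:i + H, j:j + W])
    return y

def partial_conv(x, w, ratio=4):
    """Convolve the first C // ratio channels; copy the others untouched."""
    c_p = x.shape[0] // ratio
    y = x.astype(float)                 # astype returns a copy
    y[:c_p] = conv2d_same(x[:c_p], w)
    return y

rng = np.random.default_rng(0)
x = rng.standard_normal((16, 6, 6))     # 16 channels, so 4 are convolved
w = rng.standard_normal((4, 4, 3, 3))
y = partial_conv(x, w)
print(y.shape, np.array_equal(y[4:], x[4:]))
```

Because only a quarter of the channels are convolved here, the cost of this layer is a small fraction of a full convolution, and the subsequent 1 × 1 convolution is what mixes information across all channels.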

In addition, the DNN autoencoder in SubspaceNet relies on convolution operations with local receptive fields to fuse spatial and channel information, which improves the representation ability of the network; however, certain parameters are necessary to adapt to these characteristics. An attention mechanism can improve the feature representation ability of a model, and hence its detection accuracy, by telling the network what to pay attention to and where it is located [24]. The GAM [25] is an effective network structure whose main purpose is to enhance global cross-dimension interactions while reducing the loss of information. It builds on the Convolutional Block Attention Module (CBAM), further improving and optimizing its submodules to introduce a more powerful global context interaction capability into the network. Through this mechanism, the network can better capture the correlation between different features, which improves the performance of the model in feature extraction and representation learning. Therefore, to improve the network representation ability further and extract more features, the PConv-GAM Residual structure is constructed by embedding the GAM module into the improved residual module described above, as shown in Figure 3b.

This network structure can be further expressed as:

$$Y = \mathrm{ReLU}\big(\mathrm{BN}(\mathrm{Conv}(\mathrm{GAM}(\mathrm{ReLU}(\mathrm{BN}(\mathrm{Conv}(X)))))) + \mathrm{Conv}(\mathrm{PConv}(X))\big). \tag{7}$$

PConv-GAM Residual consists of two 3 × 3 convolutions, one PConv convolution, and one 1 × 1 convolution, which constitute the main source of computational cost; the other operations (e.g., BN, ReLU, and GAM) are relatively inexpensive. Introducing this structure therefore increases the complexity of the algorithm compared with the original SubspaceNet, but the increase is mainly due to the additional convolutional layers, not to BN, ReLU, or GAM. Further, this increase is relatively low compared with approaches that employ a deeper network to improve the accuracy of DOA estimation.

Therefore, through the introduction of PConv-GAM Residual, a residual subspace network based on the fusion of PConv convolution and GAM is proposed: PConv-GAM Residual SubspaceNet (PGR-SubspaceNet). Its structure is shown in Figure 4.

The network structure has fewer layers and lower complexity. By combining multiple types of modules, it is able to extract different layers and types of features from the input data. These features are crucial for DOA estimation of underwater coherent sources. In particular, the combination of the GAM module and the two different processing streams in PConv-GAM Residual helps the model better focus on the information related to the covariance matrix in the estimation of the DOA, thus improving the performance of the model. This design not only enhances the model’s ability to capture information, but also significantly improves its overall performance in the DOA estimation task.

4. Experiments and Analysis

The experimental platform was a 64-bit Windows 10 system with an 11th Gen Intel(R) Core(TM) processor running at 3.11 GHz, and the experimental software was Python 3.8.8 and PyTorch 2.0.1 (CPU version). The network hyperparameters were a batch size of 256 and a learning rate of 0.001. The architecture was trained using the Adam optimizer [26] for 50 epochs on 40,000 samples. The other settings, such as the loss function, were consistent with those in a previous study [17].

To evaluate the effectiveness of the improved network structure, the root mean square periodic error (RMSPE) was used as an evaluation index [27] to measure the accuracy of DOA estimation. Coherent sources (AS2 is not satisfied), low SNR, and limited snapshots (AS4 is not maintained) were considered.
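For readers unfamiliar with the metric, the sketch below shows one common way to compute an RMSPE between true and estimated DOAs: the angular error is wrapped into [-π/2, π/2) and the best matching between estimates and labels is chosen over all permutations. This form is an assumption based on the usual definition of periodic error for DOA; the exact expression used in the experiments follows [17,27] and is not reproduced in the text.

```python
import numpy as np
from itertools import permutations

def rmspe(theta_true, theta_est):
    """Root mean square periodic error in radians, minimized over
    all assignments of estimates to true angles (assumed definition)."""
    theta_true = np.asarray(theta_true, dtype=float)
    theta_est = np.asarray(theta_est, dtype=float)
    K = len(theta_true)
    best = np.inf
    for perm in permutations(range(K)):
        diff = theta_true - theta_est[list(perm)]
        wrapped = (diff + np.pi / 2) % np.pi - np.pi / 2   # wrap to [-pi/2, pi/2)
        best = min(best, np.sqrt(np.sum(wrapped ** 2) / K))
    return best

err_swapped = rmspe([0.1, -0.3], [-0.3, 0.1])               # same angles, different order
err_wrapped = rmspe([np.pi / 2 - 0.01], [-np.pi / 2 + 0.01])  # near-endfire wrap-around
print(err_swapped, err_wrapped)
```

The permutation search makes the metric insensitive to the order in which sources are reported, and the wrapping avoids penalizing estimates that differ from the label only by the array's angular periodicity. The brute-force search is practical only for small numbers of sources.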

The subspace approach is based on the following assumptions [17]:

AS1: The source is narrowband.

AS2: The sources are incoherent, i.e., the signal covariance matrix $R_{s}$ in Equation (2) is diagonal.

AS3: The array is calibrated so that the steering vector is known and accurately matches the array.

AS4: The number of snapshots is large enough and the signal-to-noise ratio is high enough that the covariance matrix in Equation (2) can be reliably estimated by the empirical $\hat{R}_{X}$.

4.1. Coherent Sources

The ability of PGR-SubspaceNet to process coherent sources efficiently was evaluated. The focus was on challenging, perfectly coherent cases where all sources exhibited the same phase and amplitude. Different numbers of narrowband coherent sources were simulated, with DOAs estimated from N = 100 snapshots at an SNR of 10 dB using an eight-element uniform linear array. The PGR-SubspaceNet algorithm was compared with the SubspaceNet algorithm, classical subspace algorithms such as MUSIC, Root-MUSIC, and ESPRIT, and their spatial smoothing (SPS) variants SPS-MUSIC, SPS-Root-MUSIC, and SPS-ESPRIT.
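Why perfectly coherent sources are difficult for classical subspace methods, and what spatial smoothing does about it, can be seen in a small noise-free NumPy example. The sketch below uses forward spatial smoothing with a subarray length of 6 on an eight-element array; the steering convention, half-wavelength spacing, source angles, and subarray length are illustrative assumptions and not the settings of the SPS baselines in the experiments.

```python
import numpy as np

def forward_spatial_smoothing(R, L):
    """Average the L x L covariance blocks of the M - L + 1 overlapping subarrays."""
    M = R.shape[0]
    P = M - L + 1
    return sum(R[p:p + L, p:p + L] for p in range(P)) / P

M, L = 8, 6
m = np.arange(M)[:, None]
thetas = np.deg2rad([-20.0, 35.0])
A = np.exp(-2j * np.pi * 0.5 * m * np.sin(thetas)[None, :])   # assumed ULA model

R_s = np.ones((2, 2), dtype=complex)      # two fully coherent, equal-power sources
R_coh = A @ R_s @ A.conj().T              # noise-free array covariance
R_sps = forward_spatial_smoothing(R_coh, L)

rank_before = np.linalg.matrix_rank(R_coh, tol=1e-8)
rank_after = np.linalg.matrix_rank(R_sps, tol=1e-8)
print(rank_before, rank_after)  # 1 2
```

With coherent sources the signal part of the covariance collapses to rank one, so the signal and noise subspaces can no longer be separated correctly; smoothing restores the rank at the cost of a shorter effective aperture (6 instead of 8 elements here).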

Table 1 demonstrates that the RMSPE values based on the PGR-SubspaceNet are best among the many methods considered. This clearly indicates that the model has a smaller error in azimuth estimation and can better handle different numbers of narrowband coherent sources.

Only 40,000 training samples were used in this study, whereas 80,000 training samples were used in a previous study [17]. Nevertheless, the improved SubspaceNet model achieved a level of accuracy comparable to that of the original model. This shows that the improved SubspaceNet model can capture richer features and has stronger convergence performance. This also means that, with a smaller number of training examples, the model can achieve a satisfactory level of accuracy and exhibits excellent generalization ability.

4.2. Low Signal-to-Noise Ratio and Limited Snapshots

The DOA estimation capability of PGR-SubspaceNet for low SNRs and a limited number of snapshots was evaluated.

A case of two coherent sources, 200 snapshots, and an SNR of −5 dB was examined, and consistent with Table 1, the proposed PGR-SubspaceNet algorithm was compared with the SubspaceNet algorithm and the subspace-based DOA estimation algorithm.

The experimental results are shown in Table 2. Clearly, PGR-SubspaceNet had the lowest RMSPE when the SNR was low, indicating that the proposed method can estimate the DOA effectively, and the estimated results are more accurate.

The RMSPE for the DOA estimate versus the SNR is shown in Figure 5. The simulation conditions are consistent with Table 2, with SNRs ranging from −7 to 7 dB in steps of 2 dB.

As Figure 5 shows, the RMSPE of PGR-SubspaceNet remains lower across the tested SNR range. Overall, PGR-SubspaceNet significantly improves the ability of the original SubspaceNet to cope with high noise levels.

Table 3 lists the RMSPE values obtained for two coherent sources, two snapshots, and an SNR of 5 dB to verify the effectiveness of the PGR-SubspaceNet algorithm for limited snapshots. The results show that, even in extreme cases where AS2 and AS4 are violated, the surrogate covariance generated by PGR-SubspaceNet combined with the subspace-based DOA estimation algorithm can achieve a more reliable estimation.

4.3. Ablation Experiments

To verify the effectiveness of the PGR-SubspaceNet model proposed, experiments were carried out on the same dataset under the same conditions. These experiments consisted of comparing the original SubspaceNet model, gradually introducing the SubspaceNet versions of each improved module, and merging all of the improved modules into the SubspaceNet model.

In the experiment, the original SubspaceNet model is called A, and then the ResNet residual module is introduced on the basis of A, which is denoted as B. On this basis, the PConv module is further added to obtain C, and finally the GAM is embedded on the basis of C to obtain D, which is the method proposed. The experimental results are summarized in Table 4.

Table 4 shows that the RMSPE value of the constructed PGR-SubspaceNet model is significantly reduced by introducing the residual blocks that are deeply fused with PConv and GAM, and the DOA estimation accuracy of underwater coherent sources is improved.

5. Pool Experiments

The experimental data from the pool were used to validate the algorithm preliminarily. The pool was 20 m long, 8 m wide, and 7 m deep. The experiment used a vertically placed 10-element uniform linear array, and the first element was 0.7 m above the water surface. The transmitted signal was a 3 kHz continuous-wave pulse with a pulse length of 400 ms and a period of 1 s. The transmitting transducer was 7 m from the receiving array. A comparison of the PGR-SubspaceNet, SubspaceNet, and MUSIC algorithms in the DOA estimation of the pool experiment in the case of a single source is shown in Figure 6.

As Figure 6 shows, the target azimuth estimation method based on PGR-SubspaceNet has a better spatial spectrum and estimation accuracy. Through the pool experiments, the effectiveness of the proposed algorithm and the feasibility of the application to actual sonar equipment were preliminarily demonstrated.

6. Conclusions

A residual block fusing PConv and GAM was introduced into DOA estimation based on SubspaceNet to construct PGR-SubspaceNet. In the optimization process, the network employs the residual structure to retain the essential feature information and further deepens the feature extraction through the GAM. This enables the network to learn superior mapping from the data to capture a more appropriate covariance matrix efficiently, which helps enhance the subspace approach for more-accurate DOA estimation. Experimental results show that PGR-SubspaceNet has faster convergence than SubspaceNet. In complex situations, such as coherent signals, low SNR, and limited snapshots, PGR-SubspaceNet shows better estimation accuracy than SubspaceNet, which improves the robustness of DOA estimation. Finally, pool experiments preliminarily demonstrated the effectiveness of the PGR-SubspaceNet method based on the augmented subspace and the feasibility of using it for underwater acoustic target azimuth estimation.

The proposed PGR-SubspaceNet method improves the accuracy of target azimuth estimation under complex situations, such as coherent signals, low SNR, and limited snapshots. However, the method also has some limitations, such as the need to train the corresponding model according to the actual array parameters and sources, which limits its applicability to different scenarios. Furthermore, for deployment on actual devices, the model must be optimized further.

Further study of the target azimuth estimation performance based on PGR-SubspaceNet under wideband signals is planned, along with continuous improvement of the robustness of the model. In addition, more measured data should be collected as soon as possible, such as different formation structures and different target scenarios, for more-detailed experimental validation. Finally, we will strive to make the PGR-SubspaceNet model more flexible in terms of scenario applicability by optimizing the model structure and algorithms, and significantly improving the computational efficiency while maintaining excellent detection performance, so as to make the model more practical and deployable in practical applications.

Author Contributions

Writing, T.G.; Software, Y.X.; Methodology, Y.B.; Investigation, S.D.; Validation, Y.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Shaanxi Provincial Natural Science Basic Research Program under grant number 2024JC-YBMS-561. This research was also funded by Key points in Shaanxi Province R&D plan project under grant number 2024GX-YBXM-262, and National Natural Science Foundation of China under grant number 12004293.

Data Availability Statement

Data and results supporting the findings of this study can be obtained from the corresponding author upon reasonable request.

Conflicts of Interest

Authors Shaochun Ding and Yong Huang were employed by the company Ningbo BoHai ShenHeng Technology Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

1. Liu, M.; Liu, Z.; Lu, W.; Chen, Y.; Gao, X.; Zhao, N. Distributed few-shot learning for intelligent recognition of communication jamming. IEEE J. Sel. Top. Signal Process. 2022, 16, 395–405.
2. Yan, S.F.; Ma, Y.L. Sensor Array Beamforming Optimization Design and Application; Science Press: Beijing, China, 2009; pp. 5–20.
3. Zhan, X.C.; Sun, Z.W.; Shu, F.; Cheng, X.; Wu, Y.; Zhang, Q.; Zhang, P. Rapid Phase Ambiguity Elimination Methods for DOA Estimator via Hybrid Massive MIMO Receive Array. Chin. J. Electron. 2023, 33, 175–184.
4. Schmidt, R.O. Multiple emitter location and signal parameter estimation. IEEE Trans. Antennas Propag. 1986, 34, 276–280.
5. Barabell, A. Improving the resolution performance of eigenstructure-based direction-finding algorithms. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, Boston, MA, USA, 14–16 April 1983; ICASSP’83. IEEE: Boston, MA, USA, 1983; pp. 336–339.
6. Roy, R.; Kailath, T. ESPRIT-estimation of signal parameters via rotational invariance techniques. IEEE Trans. Signal Process. 1989, 37, 984–995.
7. Wang, L. Research on DOA Estimation Algorithm Based on Nested Array. Master’s Thesis, Xi’an University of Electronic Science and Technology, Xi’an, China, 2021; pp. 1–4.
8. Chang, P.R.; Yang, W.H.; Chan, K.K. A neural network approach to MVDR beamforming problem. IEEE Trans. Antennas Propag. 1992, 40, 313–322.
9. Xu, J.; Shen, X.; Mark, J.W.; Cai, J. Mobile location estimation for DS-CDMA systems using self-organizing maps. Wirel. Commun. Mob. Comput. 2007, 7, 285–298.
10. Shieh, C.S.; Lin, C.T. Direction of arrival estimation based on phase differences using neural fuzzy network. IEEE Trans. Antennas Propag. 2000, 48, 1115–1124.
11. Donelli, M.; Viani, F.; Rocca, P.; Massa, A. An innovative multiresolution approach for DOA estimation based on a support vector classification. IEEE Trans. Antennas Propag. 2009, 57, 2279–2292.
12. Lima, C.A.M.; Junqueira, C.; Suyama, R.; Von Zuben, F.; Romano, J. Least-squares support vector machines for DOA estimation: A step-by-step description and sensitivity analysis. In Proceedings of the 2005 IEEE International Joint Conference on Neural Networks, Montreal, QC, Canada, 31 July–4 August 2005; pp. 3226–3231.
13. Tarkowski, M.; Kulas, L. RSS-based DoA estimation for ESPAR antennas using support vector machine. IEEE Antennas Wirel. Propag. Lett. 2019, 18, 561–565.
14. Grumiaux, P.A.; Kitić, S.; Girin, L.; Guérin, A. A survey of sound source localization with deep learning methods. J. Acoust. Soc. Am. 2022, 152, 107–151.
15. Barthelme, A.; Utschick, W. DoA estimation using neural network-based covariance matrix reconstruction. IEEE Signal Process. Lett. 2021, 28, 783–787.
16. Merkofer, J.P.; Revach, G.; Shlezinger, N.; van Sloun, R.J. Deep augmented music algorithm for data-driven doa estimation. In Proceedings of the ICASSP 2022–2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Singapore, 24–28 April 2022; pp. 3598–3602.
17. Shmuel, D.H.; Merkofer, J.P.; Revach, G.; van Sloun, R.J.; Shlezinger, N. SubspaceNet: Deep Learning-Aided Subspace Methods for DoA Estimation. arXiv 2023, arXiv:2306.02271.
18. Zhang, W.M. Research on Multi-Beam Imaging Sonar Simulation and Imaging Analysis. Master’s Thesis, Inner Mongolia University, Inner Mongolia, China, 2019; pp. 11–13.
19. Ma, S.H. Research on Target Azimuth Estimation Method Based on Two-Dimensional Deconvolution under Shallow Sea Multipath Effect. Master’s Thesis, Zhejiang University, Hangzhou, China, 2021; pp. 13–14.
20. Feng, X.A.; Kou, S.W.; Tan, W.J.; Yang, B. Sparse representation theory and its applications in underwater signal processing. Acta Electron. Sin. 2021, 49, 1840–1851.
21. Shlezinger, N.; Routtenberg, T. Discriminative and Generative Learning for the Linear Estimation of Random Signals. IEEE Signal Process. Mag. 2023, 40, 75–82.
22. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
23. Chen, J.; Kao, S.H.; He, H.; Zhuo, W.; Wen, S.; Lee, C.; Chan, S. Run, Don’t Walk: Chasing Higher FLOPS for Faster Neural Networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 18–22 June 2023; pp. 12021–12031.
24. Zhao, B.; Wu, X.; Feng, J.; Peng, Q.; Yan, S. Diversified visual attention networks for fine-grained object classification. IEEE Trans. Multimed. 2017, 19, 1245–1256.
25. Liu, Y.; Shao, Z.; Hoffmann, N. Global attention mechanism: Retain information to enhance channel-spatial interactions. arXiv 2021, arXiv:2112.05561.
26. Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980.
27. Routtenberg, T.; Tabrikian, J. Bayesian parameter estimation using periodic cost functions. IEEE Trans. Signal Process. 2011, 60, 1229–1240.

Figure 1. Signal model.

Figure 2. Overall flow of the SubspaceNet trainable architecture [17].

Figure 3. Residual block and PConv-GAM residual.

Figure 4. Schematic diagram of PGR-SubspaceNet.

Figure 5. Comparison of DOA-estimated RMSPE with SNR.

Figure 6. Comparison of PGR-SubspaceNet, SubspaceNet, and MUSIC algorithms for DOA estimation in the pool experiment.

Table 1. Coherent sources: DOA accuracy.

Method	RMSPE (rad), 2 Sources	RMSPE (rad), 3 Sources	RMSPE (rad), 4 Sources
MUSIC	0.171865	0.386694	0.392426
SPS-MUSIC	0.027656	0.095822	0.160776
Root-MUSIC	0.216442	0.403290	0.402857
SPS-Root-MUSIC	0.013216	0.074491	0.155015
ESPRIT	0.452303	0.430512	0.393098
SPS-ESPRIT	0.012522	0.070224	0.155015
SubspaceNet	0.012542	0.108061	0.195121
PGR-SubspaceNet	0.003884	0.019208	0.176743

Table 2. Two coherent sources (200 snapshots, SNR = −5 dB): DOA accuracy.

Method	RMSPE (rad)
MUSIC	0.192560
SPS-MUSIC	0.144621
Root-MUSIC	0.228844
SPS-Root-MUSIC	0.126053
ESPRIT	0.454154
SPS-ESPRIT	0.125196
SubspaceNet	0.045459
PGR-SubspaceNet	0.025440

Table 3. Two coherent sources (two snapshots, SNR = 5 dB): DOA accuracy.

Method	RMSPE (rad)
MUSIC	0.335811
SPS-MUSIC	0.162200
Root-MUSIC	0.408542
SPS-Root-MUSIC	0.139074
ESPRIT	0.511599
SPS-ESPRIT	0.135919
SubspaceNet	0.048456
PGR-SubspaceNet	0.040022

Table 4. Ablation experiments.

Method	RMSPE (rad)
A	0.015865
B	0.008512
C	0.006403
D	0.005874

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
