Article

Gaussian Mixture Model for Marine Reverberations

1 Department of Automation, Hangzhou Dianzi University, Xiasha Higher Education Zone, Hangzhou 310018, China
2 Underwater Test and Control Technology Key Laboratory, Dalian Test and Control Technology Institute, Zhongshan District, Dalian 116013, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2023, 13(21), 12063; https://doi.org/10.3390/app132112063
Submission received: 15 September 2023 / Revised: 31 October 2023 / Accepted: 1 November 2023 / Published: 6 November 2023
(This article belongs to the Section Acoustics and Vibrations)

Abstract

Ocean reverberation, a significant source of interference in active sonar, is the response produced at the receiver by random scattering from randomly distributed clutter or irregular interfaces. Statistical analysis of reverberation data shows that it predominantly follows the Rayleigh distribution, which departs from simpler forms such as the Gaussian distribution. This study introduces the Gaussian mixture model, which can represent random variables following a wide range of distributions when a sufficient number of components are combined. Exploiting the statistical attributes of reverberation, we initialize the parameters of the Gaussian mixture model from the frequency histogram of the reverberation data. The model parameters are then estimated using the expectation–maximization (EM) algorithm, and the most suitable statistical model is selected according to model selection criteria. An evaluation on both simulated and measured data shows that the Gaussian mixture model accurately characterizes the distribution of reverberation data, yielding a mean squared error of less than 4‰.

1. Introduction

Active sonar systems employ transducers to emit acoustic signals with specific waveforms. Typical sonar signal frequencies are categorized as low (1 kHz–10 kHz), medium (10 kHz–50 kHz), and high (50 kHz to several hundred kHz), with each band suited to distinct applications. Waveforms, including pulsed waveforms, continuous waves, and frequency–modulated continuous waves, are selected to match the specific use case. Signal duration is adapted to application requirements, with short pulses used for target detection and longer signals for imaging or communication. The achievable penetration and range depend on the interaction between frequency, water properties, and signal strength. Low frequencies suffer less absorption and therefore reach longer ranges, whereas high frequencies offer finer resolution over shorter ranges. When the emitted signal encounters a target, echo signals are generated at specific angles and subsequently received by hydrophones. These echo signals are frequently mixed with a significant amount of ambient ocean noise and reverberation. Unlike ambient ocean noise, reverberation is a distinctive physical phenomenon induced by the signals emitted by the active sonar itself. It grows increasingly intricate as the signal strength increases and the number of scattering elements multiplies, particularly in shallow water environments where multiple scattering effects are pronounced. Consequently, the processing of reverberation signals is more complex and poses formidable challenges to active sonar technologies, including target detection, localization, and identification [1,2].
From a statistical perspective, reverberation can be viewed as a non–stationary stochastic process, essentially stemming from the stochastic scattering response at the receiving end generated by randomly distributed scatterers or randomly irregular interfaces. The statistical model for reverberation was initially proposed by Faure [3]; B. Olishevski and Middleton subsequently conducted further research on the reverberation model, and this theory is referred to as the FOA (First Order Ambisonics) reverberation theory model [4,5]. To address the challenge of parameter estimation in the presence of missing data, Arthur Dempster and colleagues introduced the expectation–maximization (EM) algorithm, which reduces computational complexity by replacing direct maximization of the likelihood function with alternating expectation and maximization steps [6]. Wei HK improved the Greedy EM algorithm in the context of image processing and successfully applied it to the one–dimensional Gaussian mixture model (GMM) modeling of underwater reverberation [7]. Furthermore, the GMM is frequently employed in various domains such as cluster analysis, acoustic modeling, image segmentation, and feature extraction. Wang PB effectively modeled ocean reverberation data using a Symmetric Alpha–Stable (SαS) distribution model that adheres to a zero–mean, unimodal bell–shaped distribution [8]. Liu WS illustrated the performance differences between traditional algorithms and the Greedy EM algorithm through numerical simulation examples [9]. Because the clustering result of the EM algorithm relies heavily on the initial probability density centers, Liu M proposed an improved EM algorithm that uses the Fuzzy C–means algorithm for parameter initialization and exhibits superior performance [10].
Fatma Najar employed the GMM, Generalized Gaussian Mixture Models (GGMMs), Bounded Gaussian Mixture Models (BGMMs), and Bounded Generalized Gaussian Mixture Models (BGGMMs) for multidimensional data clustering and assessed the robustness of the models [11]. Wen H introduced asymmetric Gaussian mixture models into finite mixture models to simulate more complex asymmetric distributions [12]. Mateusz Przyborowski presented an approximate method for the parameter learning of Gaussian mixture models in large datasets using the EM algorithm [13].
From Figure 1, it is evident that the reverberation data exhibit characteristics such as approximate zero mean, roughly equal positive and negative sample sizes, and nearly symmetrical upper and lower envelopes. Various distribution models, including the Gaussian distribution, Gaussian mixture distribution, and SαS distribution, can be used for fitting and modeling. Although the Gaussian mixture distribution has more parameters than the Gaussian and SαS distributions, it is capable of statistically modeling non–Gaussian data with non–zero mean and multiple bell shapes. Therefore, the GMM has broader applicability, particularly in the context of reverberation data. Drawing on the fundamental theory of reverberation and its statistical distribution characteristics, this paper initializes the parameters of the GMM from the reverberation data and employs the EM algorithm to iteratively generate models for different cluster numbers. Building on several evaluation criteria, a statistical modeling approach for reverberation based on the Gaussian mixture model is proposed. This method supports the investigation of the characteristics of ocean reverberation and the advancement of active sonar technology.

2. Theoretical and Statistical Distribution Characteristics of Reverberation

In shallow–water environments, the presence of non–uniformities in the ocean’s surface and seafloor, coupled with the abundance of scattering objects, results in the non–continuity of the physical properties of the oceanic medium. When sound waves traverse these non–uniform regions during underwater propagation, they undergo partial reflection, generating scattering. The cumulative scattering stemming from all scattering objects is termed “reverberation”. Ocean reverberation encompasses three distinct components: surface reverberation, seabed reverberation, and volume reverberation, with the first two collectively referred to as “interface reverberation” [14,15]. Despite reverberation arising from the amalgamation of echoes produced by a substantial number of chaotic scattering objects, these echoes originate from the same excitation source, endowing reverberation data with unique statistical characteristics. Middleton’s reverberation statistical model simplifies the representation of sound scattering non–uniformity in the ocean. It conceptualizes scattering objects as embedded within the seafloor or floating on the sea surface and within the seawater [16]. This model assumes their independence from one another while disregarding secondary and higher–order scattering effects [17,18].
Assuming the transducer emits a pulse signal represented by $s(t)$, the sound pressure due to reverberation at time $t$ can be expressed as follows [18]:
$$p(t) = \sum_{n=1}^{N} g(r_n)\, f(r_n)\, \alpha_n\, s_0(t - t_n)\, e^{j\left[\omega_0 (t - t_n) + \psi_n(t - t_n) + \phi_n\right]},$$
where $g(r_n)$ denotes the count of scattering objects within the spatial microelement $v_n$ situated at position $r_n$. The term $f(r_n)$ signifies the round–trip propagation attenuation factor for the scattering echoes originating from the scattering objects within $v_n$, while $t_n$ corresponds to the arrival time of the echo. The parameter $N$ represents the overall count of scattering spatial microelements that contribute at time $t$. Let
$$\mathrm{Re}\{p(t)\} = x(t)\cos\omega_0 t - y(t)\sin\omega_0 t.$$
For the surface reverberation $p_s(t)$, as $x_s(t)$ and $y_s(t)$ follow zero–mean Gaussian distributions, the amplitude $r_s(t) = \left[x_s^2(t) + y_s^2(t)\right]^{1/2}$ of the reverberation follows a Rayleigh distribution [18], with the following probability density:
$$p(r) = \frac{r_s(t)}{\sigma_s^2(t)} \exp\left(-\frac{r_s^2(t)}{2\sigma_s^2(t)}\right),$$
where $\sigma_s^2(t) = E\left[x_s^2(t)\right] = E\left[y_s^2(t)\right]$ represents the average intensity of the reverberation. Similarly, it can be deduced that the amplitude of the volume reverberation $p_v(t)$ also follows a Rayleigh distribution [18].
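As a quick numerical check of the Rayleigh property, the following minimal sketch draws zero-mean Gaussian quadrature components and compares the mean of the resulting amplitudes with the mean of a Rayleigh variable, $\sqrt{\pi\sigma_s^2/2}$. It assumes the standard Rayleigh density with $2\sigma_s^2$ in the exponent; the function names, the value `sigma2 = 1.5`, and the sample size are illustrative assumptions, not values taken from the reverberation data.

```python
import math
import random

def rayleigh_pdf(r, sigma2):
    """Rayleigh density for an amplitude r >= 0, with sigma2 = E[x^2] = E[y^2]."""
    return (r / sigma2) * math.exp(-r * r / (2.0 * sigma2))

def simulate_amplitudes(n, sigma2, seed=0):
    """Draw n amplitudes sqrt(x^2 + y^2) from zero-mean Gaussian x and y."""
    rng = random.Random(seed)
    sigma = math.sqrt(sigma2)
    return [math.hypot(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
            for _ in range(n)]

sigma2 = 1.5  # illustrative average intensity (assumption)
amps = simulate_amplitudes(20000, sigma2)
sample_mean = sum(amps) / len(amps)
theory_mean = math.sqrt(math.pi * sigma2 / 2.0)  # mean of a Rayleigh variable
print(f"sample mean {sample_mean:.3f}, Rayleigh mean {theory_mean:.3f}")
```

The two means should agree to within sampling error, which is the behaviour expected when both quadrature components are zero-mean Gaussian with equal variance.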
For the seabed reverberation, as the scattering objects are fixed, the reverberation sound pressure is a periodic signal [18]:
$$p_b(t) = r_b(t)\, e^{j\left[\omega_0 t + \phi_0(t)\right]}.$$
The total reverberation is as follows:
$$p_c(t) = p_s(t) + p_v(t) + p_b(t),$$
and the amplitude $r_c(t)$ of $p_c(t)$, written $r(t)$ here, follows a modified Rayleigh distribution, also known as the Rice distribution:
$$p(r) = \frac{r(t)}{\sigma_j^2(t)} \exp\left(-\frac{r^2(t) + r_b^2(t)}{2\sigma_j^2(t)}\right) I_0\left(\frac{r(t)\, r_b(t)}{\sigma_j^2(t)}\right),$$
where $\sigma_j^2(t) = \sigma_s^2(t) + \sigma_v^2(t)$, and $I_0$ is the zero–order modified Bessel function [18].
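The Rice density can be evaluated numerically with a short sketch. It assumes the standard form of the Rice distribution, in which the Bessel argument is the product $r\,r_b/\sigma_j^2$; the series-based helper `bessel_i0`, the integration routine, and the values `r_b = 2.0` and `sigma2 = 1.0` are illustrative assumptions. The check confirms that the density integrates to one and reduces to the Rayleigh density when $r_b = 0$.

```python
import math

def bessel_i0(x, terms=60):
    """Zero-order modified Bessel function via its power series."""
    total, term = 0.0, 1.0
    q = (x / 2.0) ** 2
    for m in range(terms):
        total += term
        term *= q / ((m + 1) ** 2)  # ratio of consecutive series terms
    return total

def rice_pdf(r, r_b, sigma2):
    """Rice density of the amplitude r for a coherent component r_b."""
    return ((r / sigma2)
            * math.exp(-(r * r + r_b * r_b) / (2.0 * sigma2))
            * bessel_i0(r * r_b / sigma2))

def integrate_pdf(pdf, upper, steps=4000):
    """Trapezoidal integral of pdf over [0, upper]."""
    h = upper / steps
    total = 0.5 * (pdf(0.0) + pdf(upper))
    total += sum(pdf(i * h) for i in range(1, steps))
    return total * h

area = integrate_pdf(lambda r: rice_pdf(r, r_b=2.0, sigma2=1.0), upper=12.0)
print(f"area under the Rice density: {area:.4f}")
```

Setting `r_b = 0.0` makes the Bessel factor equal to one, so the function returns the Rayleigh density, which mirrors the statement that neglecting seabed reverberation leaves a Rayleigh-distributed amplitude.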
The above analysis shows that reverberation is not a stationary random process. Its intensity decays rapidly over time. At each fixed time t , the amplitude of the reverberation follows a Rice distribution. If the seabed reverberation is neglected, then the amplitude of the reverberation follows a Rayleigh distribution. Statistical modeling can be used to describe the reverberation data. Currently, typical distributions used for this purpose include Gaussian distribution, SαS distribution, and Gaussian mixture distribution, all of which can describe data with similar statistical characteristics using their probability density function (PDF) [19].

3. Statistical Modeling of Ocean Reverberation Data Based on the Gaussian Mixture Model (GMM) Method

3.1. Gaussian Mixture Model (GMM) and Its Parameter Estimation Method (EM Algorithm)

The Gaussian mixture model is a linear combination of multiple Gaussian distributions with the following probability distribution model:
$$f(x; \lambda, \mu, \sigma) = \sum_{k=1}^{K} \lambda_k\, p_k(x; \mu_k, \sigma_k^2) = \sum_{k=1}^{K} \frac{\lambda_k}{\sqrt{2\pi\sigma_k^2}} \exp\left(-\frac{(x - \mu_k)^2}{2\sigma_k^2}\right),$$
where $K$ is the number of individual Gaussian models in the mixture model, also known as the cluster number or model order. $\lambda_k$ represents the mixture weight, satisfying $0 < \lambda_k \le 1$ and $\sum_{k=1}^{K} \lambda_k = 1$. When $K = 1$ (so that $\lambda_1 = 1$), the model degenerates into a single Gaussian model. $p_k$ represents the $k$th Gaussian component, while $\mu_k$ and $\sigma_k^2$ represent the mean and variance of that component, respectively. In theory, if the number of Gaussian components in a mixture is large enough and their weights are chosen appropriately, the Gaussian mixture model can fit any distribution [20].
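The mixture density is straightforward to evaluate in code. The sketch below is a minimal illustration; the function names are assumptions, and the two-component parameters (weights 0.4 and 0.6, zero means, variances 1 and 4) are the ones used for the simulated non-Gaussian sequence in Section 4.

```python
import math

def gaussian_pdf(x, mu, var):
    """Density of a Gaussian with mean mu and variance var at x."""
    return math.exp(-(x - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def gmm_pdf(x, weights, means, variances):
    """Weighted sum of Gaussian component densities at x."""
    return sum(w * gaussian_pdf(x, m, v)
               for w, m, v in zip(weights, means, variances))

weights, means, variances = [0.4, 0.6], [0.0, 0.0], [1.0, 4.0]
for x in (0.0, 1.0, 3.0):
    print(f"f({x}) = {gmm_pdf(x, weights, means, variances):.5f}")
```

With a single component and weight 1, `gmm_pdf` returns the ordinary Gaussian density, which corresponds to the degenerate case described above.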
The EM algorithm is used to estimate the parameters of the Gaussian mixture model with latent variables. Assume that the observed dataset is $X = \{x_1, x_2, \ldots, x_N\}$, where the data points $x_i$ are independent, and that the latent variables are $Z = \{z_1, z_2, \ldots, z_N\}$, where $z_i$ indicates which Gaussian component the sample $x_i$ comes from. Given the initial parameter value $\Theta^{(0)} = \{\lambda_k, \mu_k, \sigma_k\}$, the likelihood function of $X$ is maximized iteratively to determine the parameters $\Theta$.

3.2. Improved EM Parameter Estimation Method

In the context of parameter estimation using the EM algorithm, it is crucial to predefine the number of clusters ( K ), means, and variances. Without proper initialization of these parameters, the EM algorithm is susceptible to converging towards local optima or even experiencing convergence failures [21]. In this study, we adopt a systematic approach to address this issue. Initially, we initialize the number of clusters, means, and variances by leveraging the frequency histogram of the reverberation data. Subsequently, we employ the EM algorithm for iterative parameter estimation, covering various cluster numbers. Ultimately, the selection of the most suitable model is based on rigorous evaluation metrics. The detailed algorithmic workflow is visually presented in Figure 2.

3.2.1. Parameter Initialization Based on Reverberation Data

With an ample sample size, the histogram outcomes can be deemed representative of the actual distribution. The steps for initializing data using the frequency histogram (FH) are outlined as follows:
(1) Assuming that $K_B$ represents the optimal number of clusters, identify the local maxima and minima of each bell curve in the histogram. The abscissa of the $k$th local maximum, denoted as $\mu_k$, corresponds to the mean of the associated Gaussian component. The initial cluster number $K_H$ is set to the number of local maxima found.
(2) Determine the minimum extreme value difference, denoted as $h_{\min}$, among all bell–shaped curves. The sample width $e_k$ is the interval spanned by the samples encompassed within $h_{\min}$, centered on the mean of the respective Gaussian component. When $K_B = K_H$, the weight $\lambda_k$ of each Gaussian distribution can be estimated from $e_k$:
$$\lambda_k = \frac{e_k}{\sum_{j=1}^{K_H} e_j}.$$
(3) Using $\mu_1$ and $\mu_{K_H}$ as segmentation boundaries to separate the first and last bell–shaped curves, determine $\sigma_1$ and $\sigma_{K_H}$ by the $3\sigma$ principle. Then initialize the remaining $\sigma$ values based on the ratio of each bell curve’s maximum value to its area.
(4) When $K_B > K_H$, after steps (1)–(3), and considering that a significant portion of the reverberation data conforms to a zero–mean distribution, the Gaussian component with $\lambda = \lambda_{\max}$ is partitioned into $O$ segments ($O = K_B - K_H + 1$). Within these segments, $\mu_o$ is set to 0, $\lambda_o$ equals $1/\lambda_{\max}$, and $\sigma$ is set to 1.
As illustrated in Figure 3, the abscissas of the maximum points of the bell–shaped curves, denoted as $\mu_1$, $\mu_2$, and $\mu_3$, and the abscissas of the minimum points, $w_1$ and $w_2$, are obtained from the PDF curve. In this scenario, $K_H$ equals 3 and $h_{\min}$ equals $h_3$, and the sample widths are $e_1$, $e_2$, and $e_3$, respectively. Here $K_B$ is equal to $K_H$, which is 3. The weights of the Gaussian distributions are determined as $\lambda_k = e_k / (e_1 + e_2 + e_3)$, where $k$ runs from 1 to 3. Subsequently, the variances of the Gaussian components, $\sigma_1$, $\sigma_3$, and $\sigma_2$, are calculated using $s_1$ and $s_3$.
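The following sketch illustrates the spirit of histogram-based initialization in a simplified form; it is not the exact procedure above. Means are taken at the local maxima of a smooth histogram curve, neighbouring bells are split at the lowest bin between two peaks, and each $\sigma$ is obtained from the ratio of a bell's area to its peak height, using the fact that a Gaussian bell of area $A$ and standard deviation $\sigma$ has peak height $A/(\sqrt{2\pi}\sigma)$. As a loudly stated simplification, the weights are set from the area fractions rather than from the sample widths $e_k$. The function name `init_from_histogram` and the analytic bimodal test curve are hypothetical.

```python
import math

def init_from_histogram(centers, density):
    """Initial (weights, means, sigmas) from a smooth histogram curve.

    centers: equally spaced bin centres; density: normalised bin heights.
    """
    width = centers[1] - centers[0]
    # Local maxima of the curve give the initial means.
    peaks = [i for i in range(1, len(density) - 1)
             if density[i] > density[i - 1] and density[i] >= density[i + 1]]
    if not peaks:
        raise ValueError("no interior local maximum found")
    # The lowest bin between neighbouring peaks separates two bell curves.
    cuts = [0]
    for left, right in zip(peaks, peaks[1:]):
        cuts.append(min(range(left, right + 1), key=lambda i: density[i]))
    cuts.append(len(density))
    areas = [sum(density[a:b]) * width for a, b in zip(cuts, cuts[1:])]
    total = sum(areas)
    weights = [a / total for a in areas]  # simplification: area fractions
    means = [centers[i] for i in peaks]
    # Peak height of a Gaussian bell = area / (sqrt(2*pi) * sigma).
    sigmas = [a / (math.sqrt(2.0 * math.pi) * density[i])
              for a, i in zip(areas, peaks)]
    return weights, means, sigmas

def bimodal(x):
    """Hypothetical smooth stand-in for a two-peaked frequency histogram."""
    g = lambda m: math.exp(-(x - m) ** 2 / 2.0) / math.sqrt(2.0 * math.pi)
    return 0.5 * g(-3.0) + 0.5 * g(3.0)

centers = [-8.0 + 0.1 * i for i in range(161)]
density = [bimodal(c) for c in centers]
weights, means, sigmas = init_from_histogram(centers, density)
print(weights, means, sigmas)
```

A histogram of real data is noisy, so in practice the curve would need smoothing or coarser bins before peak detection; that step is omitted here.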

3.2.2. GMM Parameter Estimation Based on EM Algorithm

When the EM algorithm is used to estimate the parameters of the GMM, the logarithmic likelihood function [22] in Equation (4) is as follows:
$$L(x \mid \Theta) = \sum_{i} \ln \sum_{k} \omega_{i,k}\, \frac{p(x_i \mid z = k, \mu_k, \sigma_k)\, p(z = k)}{\omega_{i,k}},$$
where $\omega_{i,k}$ represents the posterior probability determined via Bayes’ Rule, and $\alpha_k = p(z = k)$ denotes the prior distribution of $z$, which plays the role of the mixture weight $\lambda_k$:
$$\omega_{i,k} = p(z = k \mid x_i, \mu_k, \sigma_k).$$
The specific steps for iteratively updating Θ are as follows:
E–step: Compute the posterior probability that each sample belongs to component $k$, using the Gaussian mixture distribution and the prior probabilities obtained in the previous iteration. Then derive the current expression for the objective function:
$$Q(\Theta, \Theta^{t}) = \sum_{i} \sum_{k} \omega_{i,k}^{t} \left[ \ln \alpha_k - \ln \omega_{i,k}^{t} - \ln \sqrt{2\pi\sigma_k^2} - \frac{(x_i - \mu_k)^2}{2\sigma_k^2} \right],$$
where $\alpha_k$ is the prior distribution of $z$.
M–step: Determine the estimated parameters of the GMM by maximizing the objective function, which yields the updated values $\alpha_k^{t+1}$, $\mu_k^{t+1}$, and $(\sigma_k^2)^{t+1}$:
$$\alpha_k^{t+1} = \frac{\sum_{i} \omega_{i,k}^{t}}{N}, \qquad \mu_k^{t+1} = \frac{\sum_{i} \omega_{i,k}^{t}\, x_i}{\sum_{i} \omega_{i,k}^{t}}, \qquad (\sigma_k^2)^{t+1} = \frac{\sum_{i} \omega_{i,k}^{t} \left(x_i - \mu_k^{t+1}\right)^2}{\sum_{i} \omega_{i,k}^{t}}.$$
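The E-step and M-step updates translate directly into a short loop. The sketch below is a minimal one-dimensional implementation under stated assumptions: the function name `em_gmm_1d`, the stopping rule on the relative change of the log-likelihood, the variance floor, and the synthetic two-component test data are all illustrative choices and are not taken from the paper. In the code, `weights` plays the role of $\alpha_k$ and `omega` holds the posteriors $\omega_{i,k}$.

```python
import math
import random

def gaussian_pdf(x, mu, var):
    """Density of a Gaussian with mean mu and variance var at x."""
    return math.exp(-(x - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def em_gmm_1d(data, weights, means, variances, max_iter=200, tol=1e-8):
    """Fit a 1-D GMM by EM; returns (weights, means, variances, log_likelihood)."""
    weights, means, variances = list(weights), list(means), list(variances)
    n, k = len(data), len(weights)
    prev_ll = -math.inf
    ll = prev_ll
    for _ in range(max_iter):
        # E-step: posterior omega[i][j] of component j for sample i (Bayes' rule).
        omega, ll = [], 0.0
        for x in data:
            joint = [weights[j] * gaussian_pdf(x, means[j], variances[j])
                     for j in range(k)]
            total = sum(joint)
            ll += math.log(total)
            omega.append([p / total for p in joint])
        # M-step: closed-form updates of weights, means, and variances.
        for j in range(k):
            nj = sum(row[j] for row in omega)
            weights[j] = nj / n
            means[j] = sum(row[j] * x for row, x in zip(omega, data)) / nj
            var = sum(row[j] * (x - means[j]) ** 2
                      for row, x in zip(omega, data)) / nj
            variances[j] = max(var, 1e-12)  # guard against a collapsing component
        # Stop when the log-likelihood from the E-step has stabilised.
        if abs(ll - prev_ll) < tol * abs(ll):
            break
        prev_ll = ll
    return weights, means, variances, ll

# Synthetic data (assumption): 30% from N(-2, 0.5^2), 70% from N(2, 1).
rng = random.Random(1)
data = [rng.gauss(-2.0, 0.5) if rng.random() < 0.3 else rng.gauss(2.0, 1.0)
        for _ in range(2000)]
w, m, v, ll = em_gmm_1d(data, [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0])
print([round(x, 2) for x in w], [round(x, 2) for x in m], [round(x, 2) for x in v])
```

The returned log-likelihood is the one computed in the last E-step, so it can be passed to a model selection criterion when several cluster numbers are compared.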

3.2.3. Model Evaluation

The Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) [23] are used to determine whether a fitted model is the optimal one [24,25]. The formal definitions of these criteria are as follows:
$$\mathrm{AIC} = 2k - 2\ln L, \qquad \mathrm{BIC} = k \ln n - 2\ln L.$$
Here, $k$ is the number of model parameters, $L$ is the maximized value of the likelihood function, and $n$ is the sample size. The AIC and BIC serve as statistical criteria for model selection, with smaller values indicating a better fit. Both criteria incorporate penalty terms that account for the number of model parameters, but the BIC applies a larger penalty than the AIC whenever $\ln n > 2$, that is, for $n \ge 8$. This difference becomes particularly relevant when the sample is large or the model is excessively complex: the AIC may be prone to overfitting under these circumstances, while the BIC mitigates the risk of over–complex models. To determine the optimal model, sets of model parameters $\Theta$ are obtained by running the iterative procedure for different $K$–values, using $K_H$ as a reference. The AIC and BIC are then used to decide which of the candidate models best fits the data.
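A minimal sketch of the two criteria follows. The parameter count $3K - 1$ for a one-dimensional GMM ($K$ means, $K$ variances, and $K - 1$ free weights) is a standard convention assumed here, since the text does not state how $k$ is counted; the log-likelihood values in `candidates` are made-up illustrative numbers, not results from the paper.

```python
import math

def gmm_num_params(n_components):
    """Free parameters of a 1-D GMM: K means, K variances, K - 1 weights."""
    return 3 * n_components - 1

def aic(log_likelihood, n_params):
    """Akaike Information Criterion; smaller is better."""
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood, n_params, n_samples):
    """Bayesian Information Criterion; smaller is better."""
    return math.log(n_samples) * n_params - 2 * log_likelihood

# Hypothetical maximised log-likelihoods for K = 1, 2, 3 on n = 1000 samples.
candidates = {1: -1520.0, 2: -1480.0, 3: -1478.5}
n_samples = 1000
best_aic = min(candidates, key=lambda K: aic(candidates[K], gmm_num_params(K)))
best_bic = min(candidates,
               key=lambda K: bic(candidates[K], gmm_num_params(K), n_samples))
print("order chosen by AIC:", best_aic, "order chosen by BIC:", best_bic)
```

In this made-up example the third component improves the log-likelihood only slightly, so the penalty terms of both criteria outweigh the gain.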

4. Simulation and Experiments Analysis

Figure 4a presents a non–Gaussian random sequence generated using the parameters $[\lambda_i, \mu_i, \sigma_i^2] = [0.4, 0, 1;\ 0.6, 0, 4]$. In Figure 5a, simulated reverberation data are depicted, where the excitation signal is a Linear Frequency Modulated (LFM) signal with a pulse width of 2 milliseconds, a frequency range spanning 60–100 kHz, and a sampling frequency of 250 kHz. Figure 6a shows simulated reverberation data for a continuous wave (CW) signal with a central frequency of 4 kHz and a sampling frequency of 25 kHz. Figures 4b, 5b, and 6b display PDF comparison plots for the various models fitted to these signals. Figures 4c, 5c, and 6c depict comparative mean square error plots for the fitted PDF results. The thick solid black line represents the true PDF curve drawn using the specified parameters. The thick red dashed line illustrates the PDF obtained through frequency histogram statistics, which can be considered an approximation of the true PDF. The pink curve denotes the PDF fitted with a Gaussian distribution and is labeled as G–D. The PDF curve for the SαS distribution, obtained with the logarithmic moment method, is drawn in navy blue and labeled as SαS–D. The light blue curve shows the fitting result of the GMM [26] and is denoted as GM–D. Table 1 documents the fitting results and error statistics for the non–Gaussian random sequences, Table 2 provides parameter estimation and error statistics for the simulated reverberation data of the LFM signal, and Table 3 provides parameter estimation and error statistics for the simulated reverberation data of the CW signal.
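A non-Gaussian sequence of the kind shown in Figure 4a can be imitated by sampling from the two-component mixture with weights 0.4 and 0.6, zero means, and variances 1 and 4. The sketch below is an illustration only: the function name `sample_gmm`, the seed, and the sample size are assumptions, and the paper does not state how its sequence was generated.

```python
import math
import random

def sample_gmm(n, weights, means, variances, seed=0):
    """Draw n samples from a 1-D Gaussian mixture."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        # Pick a component according to the mixture weights, then sample it.
        k = rng.choices(range(len(weights)), weights=weights)[0]
        samples.append(rng.gauss(means[k], math.sqrt(variances[k])))
    return samples

samples = sample_gmm(20000, [0.4, 0.6], [0.0, 0.0], [1.0, 4.0])
mean = sum(samples) / len(samples)
variance = sum((x - mean) ** 2 for x in samples) / len(samples)
# For zero-mean components the mixture variance is 0.4 * 1 + 0.6 * 4 = 2.8.
print(f"sample mean {mean:.3f}, sample variance {variance:.3f}")
```

Because both components have zero mean, the sequence is single-peaked and symmetric but heavier-tailed than a Gaussian with the same variance, which is why a single Gaussian fits it poorly.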
Mean squared error (MSE) serves as a key metric for assessing the disparity between the ground truth values and the values estimated by the fitted model. It is defined as follows:
$$\mathrm{MSE} = \frac{1}{N} \sum_{n=1}^{N} \left(\theta_n - \hat{\theta}_n\right)^2,$$
where $\theta_n$ is the ground truth PDF value, $\hat{\theta}_n$ is the PDF value estimated by the model, and $N$ is the number of snapshots. A smaller MSE indicates that the model is more accurate.
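The MSE between two PDF curves sampled at the same points reduces to a few lines. In this sketch the function name and the three-point example values are illustrative assumptions; the result is also expressed in per mille, the unit used for the reported errors.

```python
def mse(true_pdf, fitted_pdf):
    """Mean squared error between two PDF curves sampled at the same points."""
    if len(true_pdf) != len(fitted_pdf):
        raise ValueError("both curves must have the same number of points")
    return sum((t - f) ** 2 for t, f in zip(true_pdf, fitted_pdf)) / len(true_pdf)

true_values = [0.10, 0.20, 0.30]    # made-up reference PDF values
fitted_values = [0.10, 0.25, 0.20]  # made-up model PDF values
error = mse(true_values, fitted_values)
print(f"MSE = {error:.6f} ({error * 1000:.2f} per mille)")
```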
For the sake of model simplification, in cases where the MSE difference between two Gaussian mixture models with varying K–values is less than 5%, we consider the model with the higher K–value as exhibiting signs of overfitting.
When fitting simulated data using the GMM, preprocessing based on frequency histograms was performed with an initial cluster number of $K_H = 1$. As shown in Figures 4b, 5b, and 6b, the fitting results of the Gaussian distribution noticeably deviate from the true PDF, while the PDF curves of the other distributions almost perfectly overlap with the curve representing the true PDF. Similarly, these results are consistently observed in Figures 7c, 8c, 9c, and 10c. The data in Table 1, Table 2 and Table 3 suggest that the MSE for the Gaussian mixture and SαS distributions is approximately one–tenth of that for the Gaussian distribution. It can be inferred that for one–dimensional, zero–mean, single–peaked non–Gaussian data generated by different modulation signals (CW and LFM), the SαS and Gaussian mixture distributions exhibit excellent fitting capabilities, whereas the fitting results of the Gaussian distribution fall far short of expectations.

5. Verification Based on the Measured Data

5.1. Method Validation

Experiment 1 was conducted at Moganshan Lake in Huzhou, China, located at latitude 30.5425° and longitude 119.9774°. This test site is dedicated to lake–based environmental tests and features a water depth of approximately 8 m and a rocky lakebed. An active sonar system with independent transmitter and receiver components was employed. The experimental setup included a Uniform Linear Array (ULA) consisting of four hydrophones, with a sampling frequency of 250 kHz and a sampling duration of 0.26 s per acquisition. The transmitted signal was a Linear Frequency Modulation (LFM) signal with a frequency range of 2 to 4 kHz. The target in this experiment was a spherical object, and the aim was to capture its motion trajectory. Experiment 2 was carried out in the marine area near Dalian, China, situated at latitude 38.9140° and longitude 121.6146°, representing a typical marine environmental testing ground. This region is characterized by a water depth of approximately 70 m, substantial seabed sediment accumulation, and a complex environmental profile. An active sonar system with a co–located transmitter and receiver configuration was employed. In this experiment, a ULA consisting of 27 hydrophones was used, with an element spacing of 5 mm. The ULA had a sampling frequency of 1000 kHz, and the sampling duration for each session was 0.05 s. The transmitted signal was also an LFM signal, with a pulse width of 2 milliseconds and a frequency range of 100 to 200 kHz. The target was an unmanned underwater vehicle (UUV) model, and the objective was to capture its dynamics. The sound velocity in water was around 1500 m/s. Additionally, amplitude normalization was applied to the data before modeling.
It is worth noting that the Dalian region experiences complex sea conditions, abundant underwater scatterers, and strong reverberation interference, which was confirmed by subsequent waveform analysis.
Figure 7a displays the waveform of the measured reverberation data in Experiment 1, while Figures 8a, 9a, and 10a show the waveforms of three distinct segments of Experiment 2’s measured reverberation data. In Figures 7b, 8b, 9b, and 10b, PDF curves fitted by various models are presented. Figures 7c, 8c, 9c, and 10c depict comparative mean square error plots for the fitted PDF results using various models. The gray bars within the figures represent the frequency histogram of the reverberation data, with the red dashed line depicting the PDF curve derived statistically from the frequency histogram, thus representing the true model values. The PDF curve for the Gaussian distribution is depicted by a pink curve labeled as G–D in the figures. The PDF curve fitted by the GMM is represented in sky blue and marked as GM–D in the figures, while the dark blue curve corresponds to the PDF of the SαS distribution and is denoted as SαS–D in the figures.

5.2. Analysis of Results

Table 4 provides an overview of the values of $K_H$, $K_{AIC}$, $K_{BIC}$, and $K_B$, together with the MSE, for the four distinct sets of reverberation data. In Table 5, the parameter estimation results for modeling the reverberation data using various models are detailed. Table 6 focuses on the mean squared error (MSE) associated with each model. Analysis of Table 4 reveals that the optimal order for the GMM obtained through the AIC and the BIC is consistent. Specifically, for the reverberation data presented in Figure 6a, the AIC determines the optimal GMM order to be 6, with a marginal 1.6‰ difference in MSE compared to the GMM with a cluster number of 3. For the reverberation data in Figure 7a, the MSE of the fitting results with GMM orders 5 and 3 differs by a slight 1.2‰. For the dataset in Figure 9a, the GMM fitting results show a minuscule 0.7‰ difference in MSE between cluster numbers 6 and 5. In all these instances, the MSE difference remains below 5‰, and for the sake of model simplification, the smaller $K$–value is favored as the optimal model order.
Observing the PDF curve comparison charts in Figures 6b, 7b, 8b, and 9b, it becomes evident that, for reverberation data acquired in a complex measured environment, the fitting results of a Gaussian distribution deviate substantially. For the reverberation data conforming to a zero mean and a single bell–shaped distribution, as depicted in Figure 6b and Figure 7b, both the Gaussian Mixture Model (GMM) and the SαS distribution exhibit excellent fitting capabilities for the data’s PDF curve, with MSE values of less than 3‰. However, for reverberation data characterized by multiple peaks and non–zero mean, as shown in Figure 8b and Figure 9b, the SαS distribution can only adequately fit the data associated with the primary peak, resulting in an error exceeding 5‰. Conversely, the PDF curve fitted by the GMM aligns closely with the true value. The data in Table 6 further emphasize the suitability of the GMM for reverberation fitting, as they show a maximum error of 3.1‰. In contrast, the SαS distribution exhibits a maximum error of 15.9‰, while the Gaussian distribution’s error exceeds 10‰, with a maximum reaching 152‰. These results are consistently observed in Figures 7c, 8c, 9c, and 10c, which highlight the minimal error associated with the GMM and the maximum error exhibited by the Gaussian model. These findings underscore the superiority of the GMM in accurately modeling reverberation data in diverse scenarios.

6. Conclusions

In this study, the GMM was employed to statistically characterize the distribution characteristics of reverberation data. Prior to applying the expectation–maximization (EM) algorithm for parameter estimation, data preprocessing was utilized to mitigate the limitations associated with random parameter initialization, which can lead to convergence towards suboptimal solutions and require extensive computational resources. Through a systematic comparison of different cluster numbers, we effectively addressed the issue of overfitting in the Akaike Information Criterion (AIC) algorithm within an acceptable error margin, consequently reducing the model’s complexity. The validation, using both simulated and real measured reverberation data, demonstrated that both the SαS distribution and GMM models offer robust modeling capabilities for single–peaked, zero–mean reverberation data. In this context, the mean squared error (MSE) of the GMM was less than 4‰, representing less than a tenth of the MSE achieved by a Gaussian distribution. However, when dealing with reverberation data exhibiting multi–peaked distributions and non–zero means, the GMM outperformed other distributions in terms of probability density fitting. Specifically, the MSE of the GMM was less than 2‰, whereas the SαS distribution exceeded 8‰, and the Gaussian distribution exceeded 24‰. The experimental results clearly demonstrate that the GMM offers superior probability density fitting for measured reverberation data in complex environments, showcasing its broader applicability.

Author Contributions

Conceptualization, T.S. and X.Z.; methodology, Y.W. and B.J.; validation, T.S., Y.W. and M.Z.; formal analysis, M.Z.; writing—original draft preparation, T.S. and Y.W.; writing—review and editing, M.Z.; funding acquisition, T.S., X.Z. and B.J. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Joint National Natural Science Foundation of China (No. U22A2044) and the funding from the Extension Fund from Underwater Test and Control Technology Key Laboratory (No. YS24071802).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available due to privacy.

Conflicts of Interest

The authors declare no conflict of interest.

Nomenclature

EM      Expectation–maximization
FOA     First Order Ambisonics
GMM     Gaussian Mixture Model
SαS     Symmetric Alpha–Stable
GGMM    Generalized Gaussian Mixture Model
BGMM    Bounded Gaussian Mixture Model
BGGMM   Bounded Generalized Gaussian Mixture Model
PDF     Probability density function
FH      Frequency histogram
LFM     Linear Frequency Modulation
CW      Continuous wave
AIC     Akaike Information Criterion
BIC     Bayesian Information Criterion
MSE     Mean squared error

References

  1. Wang, L.; Wang, Q. The influence of marine biological noise on sonar detection. In Proceedings of the 2016 IEEE/OES China Ocean Acoustics (COA), Harbin, China, 9–11 January 2016. [Google Scholar]
  2. Tian, T. Shengna Jishu, 2nd ed.; Harbin Engineering University Press: Harbin, China, 2009. [Google Scholar]
  3. Faure, P. Theoretical Model of Reverberation Noise. J. Acoust. Soc. Am. 1964, 36, 259–266. [Google Scholar] [CrossRef]
  4. Olishevski, B. Statistical Characteristics of Sea Reverberation, 2nd ed.; Science Press: Beijing, China, 1977. [Google Scholar]
  5. Middleton, D. New physical–statistical methods and models for clutter and reverberation: The ka–distribution and related probability structures. IEEE J. Ocean. Eng. 1999, 24, 261–284. [Google Scholar] [CrossRef]
  6. Pernkopf, F.; Bouchaffra, D. Genetic–based EM algorithm for learning gaussian mixture models. IEEE Trans. Pattern Anal. Mach. Intell. 2005, 27, 1344–1348. [Google Scholar] [CrossRef] [PubMed]
  7. Wei, H.; Wang, P. Gaussian mixture model for reverberation. Tech. Acoust. 2007, 26, 514–518. [Google Scholar]
  8. Wang, P.; Wei, H.; Lou, L. Oceanic reverberation probability density modeling based on symmetric alpha–stable distribution. J. Harbin Eng. Univ. 2021, 42, 55–60. [Google Scholar]
  9. Liu, W.; Wang, P.; Gu, X. Comparison of two EM algorithms for gaussian mixture parameter estimation. Tech. Acoust. 2014, 33, 539–543. [Google Scholar]
  10. Liu, M.; Yu, Z. An improved expectation–maximum algorithm. J. Jilin Univ. (Sci. Ed.) 2022, 60, 1176–1182. [Google Scholar]
  11. Najar, F.; Bourouis, S.; Bouguila, N. A comparison between different gaussian–based mixture models. In Proceedings of the 2017 IEEE/ACS 14th International Conference on Computer Systems and Applications (AICCSA), Hammamet, Tunisia, 30 October–3 November 2017. [Google Scholar]
  12. He, W.; Yu, R.; Zheng, Y.; Jiang, T. Image denoising using asymmetric gaussian mixture models. In Proceedings of the 2018 International Symposium in Sensing and Instrumentation in IoT Era (ISSI), Shanghai, China, 6–7 September 2018. [Google Scholar]
  13. Przyborowski, M.; Ślęzak, D. Approximation of the expectation–maximization algorithm for gaussian mixture models on big data. In Proceedings of the 2022 IEEE International Conference on Big Data (Big Data), Kyoto, Japan, 13–16 December 2022. [Google Scholar]
  14. Ma, B.; Gong, L.; Chen, X.; Liu, G. Study of characteristics of acoustic intensity in acoustic vector ocean reverberation based on CW pulse. J. Nav. Univ. Eng. 2022, 34, 102–106. [Google Scholar]
  15. Ivakin, A.N.; Williams, K.L. Midfrequency acoustic propagation and reverberation in a deep ice–covered arctic ocean. J. Acoust. Soc. Am. 2022, 152, 1035–1044. [Google Scholar] [CrossRef] [PubMed]
  16. Cao, F.; Zhang, X.; Han, J. Experimental analysis of statistical property of low frequency reverberation envelope in shallow water. In Proceedings of the 2021 OES China Ocean Acoustics, Heilongjiang, China, 14–17 July 2021. [Google Scholar]
  17. Wang, J.; Wang, C.; Cheng, T. Active sonar reverberation suppression based on beam space data normalization. In Proceedings of the 2017 IEEE International Conference on Signal Processing, Communications and Computing, Xiamen, China, 22–25 October 2017. [Google Scholar]
  18. Li, Q. Introduction to Sonar Signal Processing, 2nd ed.; Chinese Academy of Sciences: Beijing, China, 2000. [Google Scholar]
  19. Glodek, M.; Schels, M.; Schwenker, F. Ensemble gaussian mixture models for probability density estimation. Comput. Stat. 2013, 28, 127–138. [Google Scholar] [CrossRef]
  20. Guo, H.; Chu, F.; Zhu, D. Research on gaussian mixture auto–regressive reverberation modeling and whitening algorithm. In Proceedings of the 2021 IEEE International Conference on Signal Processing, Communications and Computing, Xi’an, China, 17–20 August 2021. [Google Scholar]
  21. Jovanović, A.; Perić, Z.; Nikolić, J. The effect of uniform data quantization on GMM–based clustering by means of EM algorithm. In Proceedings of the 2021 20th International Symposium INFOTEH–JAHORINA, Sarajevo, Bosnia and Herzegovina, 17–19 March 2021. [Google Scholar]
  22. Kasim, F.A.B.; Pheng, H.S.; Nordin, S.Z.B. Gaussian mixture modelexpectation maximization algorithm for brain images. In Proceedings of the 2021 2nd International Conference on Artificial Intelligence and Data Sciences (AiDAS), Kuala Lumpur, Malaysia, 8–9 September 2021. [Google Scholar]
  23. Stepashko, V. Asymptotic properties of a class of criteria for best model selection. In Proceedings of the 2020 IEEE 15th International Conference on Computer Sciences and Information Technologies, Zbarazh, Ukraine, 23–26 September 2020. [Google Scholar]
  24. Wei, J.; Zhou, L. Model selection using modified AIC and BIC in joint modeling of paired functional data. Stat. Probab. Lett. 2010, 80, 1918–1924. [Google Scholar] [CrossRef]
  25. Ding, J.; Tarokh, V.; Yang, Y. Bridging AIC and BIC: A new criterion for autoregression. IEEE Trans. Inf. Theory 2017, 64, 4024–4043. [Google Scholar] [CrossRef]
  26. Mo, X.; Wen, H.; Yang, Y. A parameter estimation method of α stable distribution and its application in the statistical modeling of ice–generated noise. Acta Acust. 2023, 48, 319–326. [Google Scholar]
Figure 1. A typical waveform of element–level received data from an active sonar.
Figure 2. Statistical modeling method for ocean reverberation data based on GMM.
Figure 3. The initialization of the parameters for GMM.
Figure 4. Waveform, probability density function (PDF) curves, and mean squared error for non–Gaussian random sequences. (a) Waveform; (b) PDF curves fitted by the different models; and (c) mean squared error of the fits from the different models.
Figure 5. Simulated reverberation data for an LFM signal, with PDF curves and mean squared error. (a) Waveform; (b) PDF curves fitted by the different models; and (c) mean squared error of the fits from the different models.
Figure 6. Simulated reverberation data for a CW signal, with PDF curves and mean squared error. (a) Waveform; (b) PDF curves fitted by the different models; and (c) mean squared error of the fits from the different models.
Figure 7. Reverberation data obtained in Experiment 1, with probability density function (PDF) curves and mean squared error. (a) Waveform; (b) PDF curves fitted by the different models; and (c) mean squared error of the fits from the different models.
Figure 8. The first section of reverberation data in Experiment 2, with PDF curves and mean squared error. (a) Waveform; (b) PDF curves fitted by the different models; and (c) mean squared error of the fits from the different models.
Figure 9. The second section of reverberation data in Experiment 2, with PDF curves and mean squared error. (a) Waveform; (b) PDF curves fitted by the different models; and (c) mean squared error of the fits from the different models.
Figure 10. The third section of reverberation data in Experiment 2, with PDF curves and mean squared error. (a) Waveform; (b) PDF curves fitted by the different models; and (c) mean squared error of the fits from the different models.
Table 1. Fitting results and statistical analysis of errors for non–Gaussian random sequences.

| Distribution | G–D | SαS–D | GM–D |
|---|---|---|---|
| Parameter | μ, σ | α, β, γ, μ | λ_k, μ_k, σ_k |
| Estimation | 0.096, 1.710 | 1.579, 0.149, 1.011, 0.141 | 0.648, 0.185, 2.015 |
| | | | 0.352, −0.068, 0.881 |
| MSE | 2.2 × 10⁻⁴ | 2.4 × 10⁻⁵ | 1.8 × 10⁻⁵ |
Table 2. Parameter estimation results and error statistics of LFM signals.

| Distribution | G–D | SαS–D | GM–D |
|---|---|---|---|
| Parameter | μ, σ | α, β, γ, μ | λ_k, μ_k, σ_k |
| Estimation | −0.144, 0.271 | 1.430, 0.112, −0.148, −0.123 | 0.385, −0.123, 2.091 |
| | | | 0.615, −0.158, 0.267 |
| MSE | 0.0420 | 0.0062 | 0.0009 |
Table 3. Parameter estimation results and error statistics of CW signals.

| Distribution | G–D | SαS–D | GM–D |
|---|---|---|---|
| Parameter | μ, σ | α, β, γ, μ | λ_k, μ_k, σ_k |
| Estimation | −0.144, 0.271 | 1.430, 0.112, −0.148, −0.123 | 0.248, 0.004, 0.013 |
| | | | 0.692, 0.692, 0.037 |
| | | | 0.060, 0.031, 0.051 |
| MSE | 0.5546 | 0.0550 | 0.0139 |
Table 4. GMM fitting results with different K–values and their MSE.

| Data | Figure 6a | Figure 7a | Figure 8a | Figure 9a |
|---|---|---|---|---|
| K_H | 1 | 1 | 3 | 3 |
| K_H MSE | 0.0107 | 0.0170 | 0.0189 | 0.0230 |
| K_AIC | 6 | 5 | 4 | 6 |
| K_AIC MSE | 0.0015 | 0.0015 | 0.0013 | 0.0006 |
| K_BIC | 6 | 5 | 4 | 6 |
| K_BIC MSE | 0.0031 | 0.0015 | 0.0013 | 0.0006 |
| K_B | 3 | 3 | 4 | 5 |
| K_B MSE | 0.0031 | 0.0027 | 0.0013 | 0.0013 |
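
Table 4 lists component counts K chosen under AIC and BIC. As a rough illustration of how such a selection can be carried out, the following Python sketch fits one-dimensional Gaussian mixtures by plain EM for several values of K and scores each fit with the standard definitions AIC = 2p − 2 ln L and BIC = p ln n − 2 ln L, where p = 3K − 1 is the number of free parameters. It is a minimal sketch under stated assumptions, not the authors' implementation: it assumes NumPy, the function names `fit_gmm_1d` and `select_k` are hypothetical, the data are synthetic, and the means are initialized at evenly spaced quantiles rather than from the frequency histogram used in the paper.

```python
import numpy as np


def fit_gmm_1d(x, k, n_iter=300):
    """Fit a K-component 1-D Gaussian mixture by plain EM.

    Returns (weights, means, stds, log_likelihood).
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    # Assumed initialization (NOT the paper's histogram scheme):
    # means at evenly spaced quantiles, pooled std, equal weights.
    mu = np.quantile(x, (np.arange(k) + 0.5) / k)
    sigma = np.full(k, x.std())
    lam = np.full(k, 1.0 / k)

    def weighted_densities():
        # Column j holds lam[j] * N(x | mu[j], sigma[j]^2).
        z = (x[:, None] - mu) / sigma
        return lam * np.exp(-0.5 * z**2) / (sigma * np.sqrt(2.0 * np.pi))

    for _ in range(n_iter):
        # E-step: responsibility of each component for each sample.
        dens = weighted_densities()
        resp = dens / np.maximum(dens.sum(axis=1, keepdims=True), 1e-300)
        # M-step: re-estimate weights, means, and standard deviations.
        nk = np.maximum(resp.sum(axis=0), 1e-12)
        lam = nk / n
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk
        sigma = np.sqrt(var) + 1e-6  # small floor against component collapse

    mixture = np.maximum(weighted_densities().sum(axis=1), 1e-300)
    return lam, mu, sigma, np.log(mixture).sum()


def select_k(x, k_max=6):
    """Score K = 1..k_max and return (best K per criterion, all scores)."""
    n = len(x)
    scores = {}
    for k in range(1, k_max + 1):
        *_, loglik = fit_gmm_1d(x, k)
        p = 3 * k - 1  # k means + k stds + (k - 1) independent weights
        scores[k] = {"aic": 2 * p - 2 * loglik,
                     "bic": p * np.log(n) - 2 * loglik}
    best = {c: min(scores, key=lambda k: scores[k][c]) for c in ("aic", "bic")}
    return best, scores


# Synthetic two-component data, used only to exercise the functions.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2.0, 0.5, 1200), rng.normal(2.0, 1.0, 800)])
best, scores = select_k(x, k_max=4)
print(best)
```

Because BIC penalizes each extra parameter by ln n rather than 2, it tends to choose the same or a smaller K than AIC once n exceeds about 8 samples.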
Table 5. Fitting results of different distributions to reverberation data.

| Data | SαS–D [α, β, γ, μ] | GM–D [λ_k, μ_k, σ_k] |
|---|---|---|
| Figure 6a | 1.627, 0.151, 0.133, 0.024 | 0.767, 0, 0.023 |
| | | 0.206, 0.0138, 0.015 |
| | | 0.027, −0.037, 0.100 |
| Figure 7b | 1.077, 0.030, 0.112, 0.093 | 0.741, −0.087, 0.149 |
| | | 0.254, −0.096, 0.057 |
| | | 0.005, −0.656, 0.253 |
| Figure 8b | 1.606, −0.028, 0.09, −0.09 | 0.514, 0.062, 0.098 |
| | | 0.061, −0.062, 0.281 |
| | | 0.354, 0.077, 0.253 |
| | | 0.061, 0.782, 0.096 |
| Figure 9b | 1.537, 0.010, 0.220, 0.053 | 0.255, 0.420, 0.196 |
| | | 0.036, 0.847, 0.068 |
| | | 0.032, −0.752, 0.059 |
| | | 0.372, 0.060, 0.115 |
| | | 0.305, −0.272, 0.221 |
Table 6. Error statistics for various estimated distribution parameters.

| Data | Figure 6b | Figure 7b | Figure 8b | Figure 9b |
|---|---|---|---|---|
| MSE, G–D | 0.0107 | 0.0170 | 0.1525 | 0.0245 |
| MSE, SαS–D | 0.0037 | 0.0057 | 0.0088 | 0.0159 |
| MSE, GM–D | 0.0031 | 0.0027 | 0.0013 | 0.0013 |
Sun, T.; Wen, Y.; Zhang, X.; Jia, B.; Zhou, M. Gaussian Mixture Model for Marine Reverberations. Appl. Sci. 2023, 13, 12063. https://doi.org/10.3390/app132112063