Article

Improved Direction-of-Arrival Estimation of an Acoustic Source Using Support Vector Regression and Signal Correlation

1 Department of Computer Engineering, Z.H.C.E.T., Aligarh Muslim University, Aligarh 202002, India
2 Department of Electrical Engineering, King Khalid University, Abha 61411, Saudi Arabia
3 Department of Electrical Engineering, College of Engineering, Taif University, Taif 21944, Saudi Arabia
4 Department of Electronics Engineering, Z.H.C.E.T., Aligarh Muslim University, Aligarh 202002, India
* Author to whom correspondence should be addressed.
Sensors 2021, 21(8), 2692; https://doi.org/10.3390/s21082692
Submission received: 9 March 2021 / Revised: 5 April 2021 / Accepted: 9 April 2021 / Published: 11 April 2021
(This article belongs to the Section Electronic Sensors)

Abstract:
The direction-of-arrival (DoA) of an acoustic source can be estimated with a uniform linear array using classical techniques such as generalized cross-correlation, beamforming, and subspace methods. However, these methods require a search in the angular space and also exhibit a higher angular error at the end-fire. In this paper, we propose the use of regression techniques to improve DoA estimation at all angles, including the end-fire. The proposed methodology employs curve fitting on the received multi-channel microphone signals, which, when applied in tandem with support vector regression (SVR), provides a better estimate of the DoA than the conventional techniques and other polynomial regression techniques. A multilevel regression technique is also proposed, which further improves the estimation accuracy at the end-fire. This multilevel regression technique applies linear regression to the results obtained from SVR. The techniques employed here yielded an overall 63% improvement over the classical generalized cross-correlation technique.

1. Introduction

Applications such as hands-free mobile communication, hearing aids, target tracking, and surveillance of aerial targets require a close estimate of the direction-of-arrival (DoA) of a sound source [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18]. Techniques based on microphone arrays and acoustic vector sensors (AVSs) can be employed for the accurate estimation of the DoA of the incoming acoustic wave [19,20,21,22,23,24,25,26,27]. However, DoA estimation using these techniques encounters practical challenges, as the sound wave undergoes reflections and scattering from objects and enclosing surfaces in the surroundings. The irregular reflections of the sound waves create reverberation, and the presence of unwanted interfering sound sources degrades the quality of the received sound wave. Furthermore, ambient noise and sensor noise introduce additional disturbances in the acoustic wave. The DoA can be estimated by acquiring the signals impinging on a uniform linear array (ULA) of microphones/sensors and applying different algorithms to the digital signals acquired by these microphones. The well-known classical techniques/algorithms are beamforming, maximum likelihood, subspace methods, time-difference of arrival (TDoA), etc. [28,29,30,31,32]. However, the presence of noise, interference, and reverberation causes a larger deviation from the true value of the DoA, making these methods less suitable for many applications. In [33], the phase-mismatch and gain-mismatch errors among the sensors of the ULA were rectified using a compensated covariance matrix and phase retrieval for DoA estimation. In [34], a high-resolution, low-complexity method was presented using unfolded coprime linear arrays, where the uniform property of the sub-arrays and a polynomial root-finding method were exploited. A strategy was proposed to overcome the effect of sensor failure in a coprime array for DoA estimation by employing the singular-value thresholding algorithm [35]. To overcome the grid-mismatch limitation, a solution was proposed in [36] that addressed the DoA estimation problem in an off-grid mode under a sparse framework. The authors in [37] reported a method for near-field and far-field localization with higher accuracy in underdetermined cases by exploiting the co-array property.
The advent of machine learning (ML) in the present era has opened up avenues for exploring different ML algorithms for DoA estimation [38]. In this paper, the acoustic digital signals acquired by the ULA were used to compute the features, namely the Pearson product moment correlation coefficients (PPMCCs). The PPMCCs, paired with the known DoA, were used to train the ML model for the prediction of the DoA. We carried out a comparative study of the DoA estimation performance of multiple ML algorithms, viz. linear regression, multivariate polynomial regression, and support vector regression (SVR), and compared their results with a classical technique based on the TDoA using generalized correlation coefficients (GCCs). After assessing the best ML model for DoA estimation, we further proposed a curve-fitting-based pre-processing technique for improving the DoA estimation. Furthermore, a multilevel regression scheme was proposed for reducing the error in the DoA estimate at the end-fire.
The rest of the paper is organized as follows. In Section 2, the signal model is explained. Section 3 gives a brief discussion of the techniques used. In Section 4, the methodologies used are explained, and the progressive improvement with comparative results is presented in each subsection. Section 5 concludes the paper.

2. Signal Model

The incoming acoustic waves, moving with speed $c$, were assumed to originate from a source in the far-field; therefore, the incident wave-front was planar. The received signal was assumed to be a narrowband signal, $s(t)$, with center frequency $F$ (where $F = c/\lambda$ and $\lambda$ is the wavelength). It was assumed that the sound source and the ULA were in the same plane. The DoA with respect to the normal of the ULA is denoted by $\theta$. The ULA consisted of $M$ microphones, each assumed to be of point size and to have an omnidirectional pattern. The adjacent microphones in the ULA were separated by a distance $d$, as shown in Figure 1.
The signal received by the $m$th ($m = 1$ to $M$) sensor of the array was a phase-shifted signal in the frequency domain, which can be formulated as:

$$x_m(t) = s(t)\, e^{j \frac{2\pi}{\lambda} D_m} + n_m(t), \qquad (1)$$

where $D_m$ is the wave path difference between different microphones in the ULA, which is expressed as $D_m = (m-1)\, d \sin\theta$, and $n_m(t)$ is the noise added to the $m$th sensor. We can rewrite Equation (1) as:

$$\mathbf{x}(t) = [x_1(t), x_2(t), \ldots, x_M(t)]^T = \mathbf{a}(\theta)\, s(t) + \mathbf{n}(t), \qquad (2)$$

where $\mathbf{a}(\theta)$ is the steering vector of the ULA, $\mathbf{n}(t)$ is the noise vector, and $[\cdot]^T$ denotes the transpose. The $M \times M$ correlation matrix $\mathbf{R}_{xx}$ of the received signal vector $\mathbf{x}(t)$ is expressed as:

$$\mathbf{R}_{xx} = E\!\left[\mathbf{x}(t)\,\mathbf{x}^H(t)\right] = \mathbf{a}(\theta)\, \mathbf{S}\, \mathbf{a}^H(\theta) + \mathbf{R}_n, \qquad (3)$$

where $E[\cdot]$ and $[\cdot]^H$ denote the ensemble average and conjugate transpose, respectively. The signal and noise correlation matrices $\mathbf{S}$ and $\mathbf{R}_n$ can be expressed as:

$$\mathbf{S} = E\!\left[s(t)\, s^H(t)\right] \qquad (4)$$

and

$$\mathbf{R}_n = E\!\left[\mathbf{n}(t)\, \mathbf{n}^H(t)\right], \qquad (5)$$

respectively. We assumed that all the noise components were zero mean, mutually uncorrelated, and had the same power. Thus, we have:

$$\mathbf{R}_n = \sigma^2 \mathbf{I}, \qquad (6)$$

where $\mathbf{I}$ is the identity matrix and $\sigma^2$ is the noise power. Then, Equation (3) can be written as:

$$\mathbf{R}_{xx} = \mathbf{a}(\theta)\, \mathbf{S}\, \mathbf{a}^H(\theta) + \sigma^2 \mathbf{I}. \qquad (7)$$
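As an illustration of this signal model, the short sketch below simulates noisy narrowband snapshots received by a ULA following Equations (1) and (2). It is a minimal sketch, not the code used in the paper; the array spacing, source frequency, sampling rate, and snapshot length are assumed values chosen only for illustration.

```python
# Minimal sketch (illustrative, not the paper's code): narrowband ULA snapshots per Eqs. (1)-(2).
import numpy as np

def ula_snapshots(theta_deg, M=4, d=0.05, freq=2000.0, c=343.0,
                  fs=16000, duration=0.025, snr_db=26, seed=None):
    """Return an M x N matrix of noisy narrowband signals received by a ULA."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(fs * duration)) / fs
    lam = c / freq                                   # wavelength
    theta = np.deg2rad(theta_deg)
    s = np.exp(1j * 2 * np.pi * freq * t)            # unit-power narrowband source s(t)
    D = np.arange(M) * d * np.sin(theta)             # path differences D_m = (m-1) d sin(theta)
    a = np.exp(1j * 2 * np.pi * D / lam)             # steering vector a(theta)
    x = np.outer(a, s)                               # noise-free x(t) = a(theta) s(t)
    noise_power = 10 ** (-snr_db / 10)               # sensor-noise power for the chosen SNR
    n = np.sqrt(noise_power / 2) * (rng.standard_normal(x.shape)
                                    + 1j * rng.standard_normal(x.shape))
    return x + n                                     # Eq. (2): x(t) = a(theta) s(t) + n(t)

X = ula_snapshots(theta_deg=30, seed=0)              # 25 ms snapshots for a 30-degree source
```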

3. Brief Discussion of the Techniques Used

In this paper, the process of DoA estimation employed several techniques, which, when applied in tandem, enabled a near-accurate estimation of the DoA. A brief discussion of these techniques is given in the following subsections.

3.1. Polynomial Regression

Linear regression is a technique that helps to find a linear relationship between predictors x and the response y. On the other hand, polynomial regression is a technique that helps identify a non-linear relationship between predictors and the response. As explained in [39], the degree of polynomial regression has to be predetermined before the training. Based on the degree of the polynomial, n, a parameterized equation is of the form:
$$y = \mathbf{b}\,\mathbf{x} + \epsilon, \qquad (8)$$

where $\mathbf{b} = [b_1, b_2, \ldots, b_n]$ is the parameter vector to be estimated for the best fit and $\mathbf{x} = [x, x^2, x^3, \ldots, x^n]^T$. The non-linear terms such as $x^2$, $x^3$, …, are considered to be derived dimensions based on the base dimension $x$. Polynomial regression thus involves performing multivariate linear regression with each higher-order term treated as a separate dimension. A univariate regression involves a single predictor variable, whereas a multivariate regression involves multiple predictor variables.
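The following sketch illustrates this view of polynomial regression as multivariate linear regression over derived dimensions. It assumes scikit-learn as the implementation, which the paper does not specify; the toy data and chosen degree are illustrative only.

```python
# Minimal sketch: polynomial regression of order n as linear regression
# over the derived dimensions x, x^2, ..., x^n (Eq. (8)).
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

def polynomial_regressor(degree):
    # PolynomialFeatures generates the derived dimensions; LinearRegression
    # then fits the parameter vector b of Eq. (8).
    return make_pipeline(PolynomialFeatures(degree=degree, include_bias=False),
                         LinearRegression())

# toy usage: fit a noisy cubic relationship with a degree-3 model
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=(200, 1))
y = x.ravel() ** 3 + 0.01 * rng.standard_normal(200)
model = polynomial_regressor(degree=3).fit(x, y)
```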

3.2. Support Vector Regression

Support vector regression (SVR) is a technique that finds a non-linear mathematical relationship between predictors and the response where the prior information of the polynomial degree is not required [40,41]. The technique involves projecting the predictor space into a multidimensional space using a kernel function. In this work, the radial basis function (RBF) was used as the kernel function. The RBF is given as:
$$K(\mathbf{x}, \mathbf{x}') = \exp\!\left(-\gamma\, \lVert \mathbf{x} - \mathbf{x}' \rVert^{2}\right), \qquad (9)$$

where $\mathbf{x}$ and $\mathbf{x}'$ are two vectors in the feature space. This kernel function expands into multidimensional terms. The estimation error is measured using the following loss function:

$$L = \begin{cases} 0, & \left| y - F(\mathbf{x}, \hat{w}) \right| < \epsilon \\ \left| y - F(\mathbf{x}, \hat{w}) \right| - \epsilon, & \text{otherwise.} \end{cases} \qquad (10)$$

As mentioned in [40], $F(\mathbf{x}, w)$ is a family of functions parameterized by $w$, and $\hat{w}$ is the value of $w$ that minimizes a measure of the error between $y$ and $F(\mathbf{x}, w)$. This loss function is termed the $\epsilon$-insensitive loss function. It identifies a high-dimensional tube of diameter $\epsilon$: if the estimate is within the tube, the loss is zero; otherwise, the loss is the distance from the point of estimation to the closest tube periphery. The objective is to flatten this tube to the maximum extent possible. In the training part, linear regression was performed in this high-dimensional space. As a result of the regression process, a linear hyperplane was identified that reduced the overall loss.
The roots of the SVR method are the same as the popular support vector machine (SVM) method, which is used in classification problems and utilizes the same underlying theory. SVM cannot be directly applied here as the DoA estimation was modeled in this work as a regression problem rather than a classification problem.
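A minimal sketch of $\epsilon$-insensitive SVR with an RBF kernel is shown below, again assuming scikit-learn; the hyperparameters (C, gamma, epsilon) and the placeholder feature/response arrays are illustrative stand-ins for the PPMCC features and true DoAs used in the paper.

```python
# Minimal sketch of RBF-kernel SVR with the epsilon-insensitive loss (Eqs. (9)-(10)).
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(1)
X_train = rng.uniform(-1, 1, size=(500, 6))   # placeholder for six PPMCC features per realization
y_train = np.sum(X_train, axis=1)             # placeholder response (the true DoA in the paper)

# Hyperparameters are illustrative and would need tuning on real data.
svr = SVR(kernel="rbf", gamma="scale", C=1.0, epsilon=0.1)
svr.fit(X_train, y_train)                     # linear regression in the RBF-induced feature space
y_hat = svr.predict(X_train[:5])              # example predictions
```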

3.3. Pearson Product Moment Correlation Coefficient

As explained in [42], the Pearson product moment correlation coefficient (PPMCC) quantifies the degree of association between two statistical variables and ascertains whether the variables are directly or inversely associated with each other. The PPMCC ranges between +1 and −1. A value of +1 indicates the highest degree of positive association, i.e., an increase in one variable is commensurate with an increase in the other variable. A PPMCC value of −1 indicates the highest degree of negative association, where an increase in one variable is commensurate with a decrease in the other variable. Values within the interval (−1, +1) indicate a corresponding degree of association proportional to their magnitude. If $L$ is the sample size, $\{p_i\}_{i=1}^{L}$ and $\{q_i\}_{i=1}^{L}$ are the two variables, $\bar{p}$ is the mean of $\{p_i\}_{i=1}^{L}$, and $\bar{q}$ is the mean of $\{q_i\}_{i=1}^{L}$, then the PPMCC, $r_{pq}$, is given by:
$$r_{pq} = \frac{\sum_{i=1}^{L} (p_i - \bar{p})(q_i - \bar{q})}{\sqrt{\sum_{i=1}^{L} (p_i - \bar{p})^2 \; \sum_{i=1}^{L} (q_i - \bar{q})^2}}. \qquad (11)$$
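A short sketch of Equation (11) is given below; the variable names and data are illustrative, and the result is cross-checked against NumPy's built-in correlation routine.

```python
# Minimal sketch: computing the PPMCC r_pq of Eq. (11) directly.
import numpy as np

def ppmcc(p, q):
    p, q = np.asarray(p, float), np.asarray(q, float)
    pc, qc = p - p.mean(), q - q.mean()
    return np.sum(pc * qc) / np.sqrt(np.sum(pc ** 2) * np.sum(qc ** 2))

rng = np.random.default_rng(0)
p = rng.standard_normal(400)
q = 0.8 * p + 0.2 * rng.standard_normal(400)          # positively associated with p
assert np.isclose(ppmcc(p, q), np.corrcoef(p, q)[0, 1])  # matches numpy's PPMCC
```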

3.4. Curve Fitting

Curve fitting is the process of identifying a curve that fits a series of data points in the best possible manner. The identification of the curve starts with the proposition of a parameterized mathematical model. The objective of curve fitting is to identify the values of these parameters such that the mathematical model minimizes the overall fitting error. For the best fit, the objective function has to be minimized with respect to the parameters, $\mathbf{p}$, using the Levenberg–Marquardt (LM) algorithm explained in [43,44]. The LM algorithm internally combines two different methods, viz. the gradient descent method (GDM) and the Gauss–Newton method (GNM). In the LM algorithm, the GDM dominates when the parameters are far from their actual values, whereas the GNM dominates when the parameters are close to them. The goodness-of-fit is measured using the chi-squared error given below:

$$\chi^2(\mathbf{p}) = \sum_{i=1}^{m} \left[ \frac{g(t_i) - \hat{g}(t_i; \mathbf{p})}{\sigma_{g_i}} \right]^2, \qquad (12)$$

where $\sigma_{g_i}$ is the average error, $\hat{g}(t; \mathbf{p})$ is the fitted function of the independent variable $t$ and a vector of $n$ parameters $\mathbf{p}$, and $m$ is the number of data points in the data set.
The combination of the GDM and GNM is accomplished using a parameter $\lambda$ that is tuned so that the update falls into the appropriate method based on its magnitude:

$$\left[ \left( \frac{\partial \hat{\mathbf{g}}}{\partial \mathbf{p}} \right)^{T} \boldsymbol{\Theta}\, \frac{\partial \hat{\mathbf{g}}}{\partial \mathbf{p}} + \lambda \mathbf{I} \right] \mathbf{h}_{m} = \left( \frac{\partial \hat{\mathbf{g}}}{\partial \mathbf{p}} \right)^{T} \boldsymbol{\Theta}\, (\mathbf{g} - \hat{\mathbf{g}}), \qquad (13)$$

where $\boldsymbol{\Theta}$ is a diagonal matrix with elements $w_{ii} = 1/\sigma_{g_i}^2$ and $\mathbf{h}_{m}$ is the perturbation that reduces the chi-squared error. To approach the global minimum reliably, the first few steps are kept small and taken in the steepest-descent direction, which is accomplished by keeping the value of $\lambda$ large; with a small $\lambda$, the update behaves closer to the Gauss–Newton update. If an iteration gives a high error, then $\lambda$ is increased and the update behaves like gradient descent. The SciPy package in Python provides the function curve_fit in its optimize module, which performs curve fitting using multiple algorithms; in this work, we used this function with the LM algorithm. Since the acoustic waves received at the microphone array were sinusoidal with additive white Gaussian noise (AWGN), the sinusoidal function $a \sin(bx + \phi)$ was used to fit the curve, where $a$, $b$, and $\phi$ are the parameters to be tuned.
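The sketch below illustrates this pre-processing step with SciPy's curve_fit, selecting the LM method; the snapshot parameters, noise level, and initial guesses are assumed values for illustration and are not taken from the paper.

```python
# Minimal sketch: fitting a*sin(b*x + phi) to a noisy snapshot with
# scipy.optimize.curve_fit (method="lm" selects Levenberg-Marquardt).
import numpy as np
from scipy.optimize import curve_fit

def sinusoid(x, a, b, phi):
    return a * np.sin(b * x + phi)

fs = 16000
t = np.arange(int(0.025 * fs)) / fs                       # 25 ms snapshot (illustrative)
rng = np.random.default_rng(0)
noisy = np.sin(2 * np.pi * 2000 * t + 0.3) + 0.1 * rng.standard_normal(t.size)

p0 = [1.0, 2 * np.pi * 2000, 0.0]                         # initial guesses for (a, b, phi)
params, _ = curve_fit(sinusoid, t, noisy, p0=p0, method="lm")
clean = sinusoid(t, *params)                              # "sanitized" signal fed to the regressors
```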

4. Methodologies and Results

A ULA of omnidirectional microphones was used to record the spatial signals for developing the machine learning model for DoA estimation. The data set consisted of multiple microphone recordings of 25 ms duration each, acquired for sound-source angles of 0°, 2°, 4°, 6°, …, 90° at different signal-to-noise ratio (SNR) values. For each such angle, with sensor noise at SNR = 26 dB, 1400 independent realizations of 25 ms duration were used for training. For testing, a new data set was created with angles from 0° to 90° in increments of 1° for SNR = 22 dB, 18 dB, 14 dB, and 10 dB. For the recorded signals at each angle, we computed the PPMCCs between the discrete signals from the microphones; the discrete signals from the 1st, 2nd, 3rd, and 4th microphones produced six pairwise correlations. These signals were further processed to estimate the DoA. The following subsections explain the different methodologies that were applied for DoA estimation and present the results obtained with each of them. Since various techniques and methodologies were assessed for improving the DoA estimation, each succeeding subsection employs a technique that improves over the best result obtained in the preceding subsection. The results are reported in terms of the root mean squared angular error ($RMSAE$) and its average over all angles ($\overline{RMSAE}$), and in each subsection they are compared with the results of the preceding subsections. The $RMSAE$ and $\overline{RMSAE}$ are defined in Equations (14) and (15), respectively:
$$RMSAE(\theta) = \sqrt{\frac{\sum_{i=1}^{N} (\theta - \theta_i)^2}{N}} \qquad (14)$$

and

$$\overline{RMSAE} = \frac{1}{TOA} \sum_{\theta = 0^{\circ}}^{90^{\circ}} RMSAE(\theta), \qquad (15)$$

where $\theta_i$ is the $i$th prediction of the true angle $\theta$, $N$ is the total number of predictions realized, and $TOA$ is the total number of angles observed.
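A small sketch of these two metrics is given below; the per-angle prediction arrays are placeholders standing in for the estimates produced by each model.

```python
# Minimal sketch of the error metrics in Eqs. (14)-(15).
import numpy as np

def rmsae(theta_true, theta_preds):
    """Root mean squared angular error for one true angle (Eq. (14))."""
    theta_preds = np.asarray(theta_preds, float)
    return np.sqrt(np.mean((theta_true - theta_preds) ** 2))

def mean_rmsae(preds_by_angle):
    """Average RMSAE over all observed angles (Eq. (15))."""
    return np.mean([rmsae(theta, p) for theta, p in preds_by_angle.items()])

# usage with dummy predictions for 0..90 degrees
rng = np.random.default_rng(0)
preds_by_angle = {theta: theta + rng.standard_normal(100) for theta in range(91)}
print(mean_rmsae(preds_by_angle))
```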

4.1. Regression Techniques with PPMCC

The PPMCCs were calculated between the signals $s_1(t)$, $s_2(t)$, $s_3(t)$, and $s_4(t)$ recorded by the four microphones $m_1$, $m_2$, $m_3$, and $m_4$, respectively. The PPMCCs between the pairs of microphone signals produced six coefficients: $c_{12}$, $c_{13}$, $c_{14}$, $c_{23}$, $c_{24}$, and $c_{34}$. These PPMCCs were used for training on the data set to produce the mathematical model of the DoA estimate. We trained the models on the data set with SNR = 26 dB. The feature set was composed of the correlation coefficients mentioned above, and the response was the actual DoA. The regression techniques used and analyzed in this experiment were SVR and polynomial regression of orders one through seven (PR1 to PR7). In addition to these regression techniques, we also estimated the DoA with the conventional generalized cross-correlation (GCC) technique for comparison.
Figure 2 shows the comparative assessment of SVR, PR1, PR2, and the GCC. Polynomial regressions of order higher than two are not shown in this figure, as their $RMSAE$ values were too high to display; in the following section, we propose a mechanism to reduce their $RMSAE$ values. Figure 2 also reveals that SVR gave a low $RMSAE$ at the higher SNR values of 22 dB and 18 dB. However, at the lower SNR values of 14 dB and 10 dB, the GCC was more robust. Moreover, the GCC provided an $RMSAE$ close to that of SVR even at the higher SNR values. In short, the GCC was the better estimator, as it consistently gave a good approximation at both low and high SNR values (the next section improves the performance of the ML algorithms). Among the regression techniques, SVR performed better than PR1, which in turn performed better than PR2. The $\overline{RMSAE}$ of all the regression techniques and the GCC is shown in Table 1. Another observation was that the regression techniques had a higher $RMSAE$ at the broadside (between 0° and 5°), as well as at the end-fire (between 80° and 90°). The cause of the higher $\overline{RMSAE}$ near the end-fire was identified as follows. Consider the data set with SNR = 26 dB, for which the PPMCCs were computed. We then took the ensemble average of the PPMCCs ($E[c_{12}]$, $E[c_{13}]$, $E[c_{14}]$, $E[c_{23}]$, $E[c_{24}]$, and $E[c_{34}]$) for each DoA, which is shown in Figure 3. This revealed that the correlations at the end-fire were steady, with almost the same values, owing to a smaller change in the relative time-delay with respect to the DoA. The relative time-delay, $\tau_{i,i+1}$, between the signals of two adjacent microphones ($m_i(t)$ and $m_{i+1}(t)$) for a planar wave-front is approximated by:
$$\tau_{i,i+1} \approx \frac{d}{c}\, \sin(\theta), \qquad (16)$$
where d is the microphone separation, c is the speed of sound, and θ is the direction of arrival of a planar wave with respect to the axis normal to the ULA. The rate of change of relative delay with respect to θ is given by:
$$\frac{d\tau_{i,i+1}}{d\theta} \approx \frac{d}{c}\, \cos(\theta), \qquad (17)$$
which shows that at $\theta \approx 90^{\circ}$, $d\tau_{i,i+1}/d\theta$ is small; therefore, the rate of change of the PPMCC with respect to the DoA is also small. This steady value of $\tau_{i,i+1}$ caused the ensemble-averaged PPMCCs to take nearly the same values over a span of DoAs, thereby causing the regression techniques to err in the estimation of the DoA near the end-fire. For the no-noise case at the broadside, the signals at all microphones would have similar waveforms, which implies that the PPMCC should be unity; however, random noise at the microphones caused random variation of the PPMCC, and thereby the learning/training of the machine from the broadside data was poor, hence the higher $\overline{RMSAE}$ there.
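The brief numeric sketch below illustrates Equations (16) and (17): the per-degree change of the relative delay shrinks toward zero at the end-fire, which is why neighbouring DoAs there yield nearly identical PPMCCs. The microphone spacing used is an assumed value for illustration.

```python
# Numeric illustration of Eqs. (16)-(17): delay sensitivity vs. DoA.
import numpy as np

d, c = 0.05, 343.0                          # assumed spacing (m) and speed of sound (m/s)
angles = [0, 45, 85, 89]
delay_rate = (d / c) * np.cos(np.deg2rad(angles))   # d(tau)/d(theta) in s/rad
for th, r in zip(angles, delay_rate):
    print(f"theta = {th:2d} deg: d tau / d theta = {r:.2e} s/rad")
# near 90 degrees the rate approaches zero, so neighbouring DoAs produce
# nearly identical relative delays and hence nearly identical PPMCC features
```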

4.2. Improvement with Curve Fitting

To improve the DoA estimation using regression techniques, curve fitting, as described in Section 3.4, was applied to the noisy sinusoidal signals recorded by the microphones. The process of curve fitting reduced the noise in the recorded signals, thereby providing a sanitized input for improved results. After pre-processing the recorded noisy signals with curve fitting, the regression techniques mentioned in Section 4 were applied. The $RMSAE$ versus DoA results for SNRs ranging from 22 dB down to 10 dB in steps of 4 dB are shown in Figure 4, together with the results obtained from the GCC for comparison. A closer look at the results reveals that pre-processing with curve fitting reduced the $RMSAE$ of the DoA estimates obtained with the regression techniques. Among the regression techniques, SVR outperformed them all in terms of the $RMSAE$. Without curve fitting, the higher-order polynomial regressions yielded poor DoA estimates with excessively high $RMSAE$ values; after pre-processing with curve fitting, their $RMSAE$ was significantly reduced. The GCC continued to perform well with a consistently low $RMSAE$, but its error curve was highly rugged with respect to the true DoA. Figure 5 shows the performance comparison of SVR and the GCC with and without curve fitting; it was observed that SVR with curve fitting had the lower $RMSAE$. A comparison in terms of the $\overline{RMSAE}$ of the different regression techniques and the GCC, with and without curve fitting, is shown in Table 1. It can be seen from this table that SVR with curve fitting had the lowest $\overline{RMSAE}$ for each of the SNR values of 22 dB, 18 dB, 14 dB, and 10 dB. In contrast to the GCC, the improvement brought by curve fitting to the performance of SVR became more pronounced as the SNR decreased.

4.3. DoA Estimation Improvement at the End-Fire

As mentioned in Section 4.1, it can be observed from all the preceding results that the $RMSAE$ was consistently high at the end-fire, i.e., for DoAs of 80° and above. To mitigate this high error, a bias and standard deviation analysis of the DoA estimates was performed. Based on this analysis, we proposed a multilevel regression to reduce the bias and further improve the accuracy at the end-fire.

4.3.1. Analysis of the DoA Estimation: Bias and Standard Deviation

The bias of a DoA estimator is the difference between the estimated DoA and the true DoA. If the bias of a DoA estimator is a known value, then the estimate can be improved by subtracting the bias from the estimated value. The standard deviation of the DoA estimate indicates how much an estimated DoA differs from its expected value; this is a random error and cannot be compensated for a given estimate. Figure 6 and Figure 7 show the bias and standard deviation of the DoA estimation for SVR and the GCC, both with curve fitting. It can be observed from these figures that the bias of SVR was stable, whereas the bias of the GCC fluctuated with respect to the true DoA. This stable bias of SVR was beneficial, as it aided bias-error compensation. The standard deviation of the DoA estimate for SVR was also more stable than that of the GCC; in contrast to SVR, the standard deviation of the GCC estimate showed spikes. Therefore, it can be concluded that SVR with curve fitting provided the best DoA estimate among all the techniques considered.

4.3.2. Improved DoA Estimation with Multilevel Regression

It can be observed from Figure 4 that, despite curve fitting, the DoA estimation error in the range between 80° and 90° (end-fire) remained relatively higher than that for DoAs between 0° and 80°; curve fitting alone brought no substantial improvement at the end-fire. Figure 6 reveals that the SVR bias curve for DoA estimation increased near the end-fire. To remove this bias, a second-level regression model (SLRM) was applied at the end-fire. The input to the SLRM was the DoA estimated by the SVR with curve fitting (SVR-CF) model, and its output was expected to compensate for this bias. The SLRM was trained with linear regression (LR) on angles estimated to lie between 80° and 90° at SNR = 26 dB. The response of this second-level regression for (a) SNR = 22 dB, (b) SNR = 18 dB, (c) SNR = 14 dB, and (d) SNR = 10 dB is shown in Figure 8. It was observed that the $RMSAE$ was reduced significantly at the end-fire, with the maximum $RMSAE$ dropping from 6° to 2°. Table 2 compares the results of the SVR-CF model and the SVR-CF model in tandem with LR (SVR-CF-LR) in terms of the $\overline{RMSAE}$. It can be seen that the $\overline{RMSAE}$ values were significantly reduced, indicating that the SLRM further improved the DoA estimation.
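A minimal sketch of this second-level regression is shown below, assuming scikit-learn's LinearRegression; the first-level SVR-CF estimates are synthetic placeholders with an artificial end-fire bias, not the paper's data.

```python
# Minimal sketch of the second-level regression model (SLRM): a linear
# regressor trained on first-level estimates in the 80-90 degree range,
# used to compensate the end-fire bias of the first-level model.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
true_doa = rng.uniform(80, 90, size=2000)
# synthetic first-level estimates with a growing bias toward the end-fire
first_level = true_doa - 0.3 * (true_doa - 80) + 0.5 * rng.standard_normal(2000)

slrm = LinearRegression().fit(first_level.reshape(-1, 1), true_doa)

def predict_doa(first_level_estimate):
    """Apply the SLRM only when the first-level estimate lies at the end-fire."""
    if 80.0 <= first_level_estimate <= 90.0:
        return float(slrm.predict([[first_level_estimate]])[0])
    return first_level_estimate
```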

5. Conclusions

The DoA estimation of a sound source has many applications, such as unmanned aerial vehicles, hearing aids, and surveillance. Many techniques employing a uniform linear array of microphones, such as beamforming and the GCC, have been studied in the past. However, these techniques produce errors in the estimated angle owing to the noise that blends with the received signals. This work aimed at increasing the efficacy of DoA estimation by exploring multiple machine learning techniques. Models for polynomial regression of orders one to seven and SVR were trained with the PPMCCs as the selected features and compared with the classical GCC technique. Since the results were not very impressive, we explored pre-processing of the incoming signals for improved results. Curve fitting was applied as the pre-processing technique, and the same regression models (PR1 to PR7 and SVR) were trained, tested, and compared after pre-processing. Among the regression techniques used, SVR fared the best, with the minimum error compared with the PR techniques and the GCC. However, all the techniques were shown to yield high errors near the end-fire. To reduce the error at the end-fire, multilevel regression was applied, with linear regression as the second-level regression on the results of SVR. This technique proved to be nearly accurate, stable, and unbiased, and produced approximately a 63% improvement in the estimated angle when compared with the classical GCC technique.

Author Contributions

Conceptualization, F.A. and M.W.; Methodology, F.A. and M.W.; Software, F.A. and M.W.; Validation, M.U. and H.I.A.; Formal Analysis, F.A., M.U., H.I.A. and M.W.; Investigation, F.A. and M.W.; Resources, M.U. and H.I.A.; Writing – Original Draft Preparation, F.A. and M.W.; Writing – Review & Editing, F.A., M.U., H.I.A. and M.W.; Visualization, F.A. and M.W.; Supervision, M.W.; Project Administration, M.U.; Funding Acquisition, H.I.A. All authors have read and agreed to the published version of the manuscript.

Funding

The authors would like to acknowledge the support from Taif University Researchers Supporting Project Number (TURSP-2020/264), Taif University, Taif, Saudi Arabia.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data sharing not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
DoA: Direction of arrival
SVR: Support vector regression
AVS: Acoustic vector sensor
ULA: Uniform linear array
TDoA: Time-difference of arrival
ML: Machine learning
PPMCC: Pearson product moment correlation coefficient
GCC: Generalized correlation coefficient
SVM: Support vector machine
LM: Levenberg–Marquardt
RBF: Radial basis function
GDM: Gradient descent method
GNM: Gauss–Newton method
SNR: Signal-to-noise ratio
AWGN: Additive white Gaussian noise
PRn: Polynomial regression of order n
SLRM: Second-level regression model
SVR-CF: SVR with curve fitting
SVR-CF-LR: SVR-CF in tandem with linear regression

References

  1. Zheng, X.; Ritz, C.; Xi, J. Encoding and communicating navigable speech soundfields. Multimed. Tools Appl. 2016, 75, 5183–5204. [Google Scholar] [CrossRef] [Green Version]
  2. Asaei, A.; Taghizadeh, M.J.; Haghighatshoar, S.; Raj, B.; Bourlard, H.; Cevher, V. Binary sparse coding of convolutive mixtures for sound localization and separation via spatialization. IEEE Trans. Signal Process. 2016, 64, 567–579. [Google Scholar] [CrossRef] [Green Version]
  3. Bekkerman, I.; Tabrikian, J. Target detection and localization using mimo radars and sonars. IEEE Trans. Signal Process. 2006, 54, 3873–3883. [Google Scholar] [CrossRef]
  4. Wong, K.T.; Zoltowski, M.D. Closed-form underwater acoustic direction-finding with arbitrarily spaced vector hydrophones at unknown locations. IEEE J. Ocean. Eng. 1997, 22, 566–575. [Google Scholar] [CrossRef]
  5. Sheng, X.; Hu, Y.-H. Maximum likelihood multiple-source localization using acoustic energy measurements with wireless sensor networks. IEEE Trans. Signal Process. 2005, 53, 44–53. [Google Scholar] [CrossRef] [Green Version]
  6. Zhao, S.; Ahmed, S.; Liang, Y.; Rupnow, K.; Chen, D.; Jones, D.L. A real-time 3d sound localization system with miniature microphone array for virtual reality. In Proceedings of the 2012 7th IEEE Conference on Industrial Electronics and Applications (ICIEA), Singapore, 18–20 July 2012; pp. 1853–1857. [Google Scholar]
  7. Clark, J.A.; Tarasek, G. Localization of radiating sources along the hull of a submarine using a vector sensor array. In Proceedings of the OCEANS 2006, Boston, MA, USA, 18–21 September 2006; pp. 1–3. [Google Scholar]
  8. Carpenter, R.N.; Cray, B.A.; Levine, E.R. Broadband ocean acoustic (boa) laboratory in narragansett bay: Preliminary in situ harbor security measurements. In Defense and Security Symposium; International Society for Optics and Photonics: Bellingham, WA, USA, 2006; p. 620409. [Google Scholar]
  9. DiBiase, J.H.; Silverman, H.F.; Brandstein, M.S. Robust localization in reverberant rooms. In Microphone Arrays; Springer: Berlin, Germany, 2001; pp. 157–180. [Google Scholar]
  10. Bechler, D.; Schlosser, M.S.; Kroschel, K. System for robust 3d speaker tracking using microphone array measurements. In Proceedings of the 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (IEEE Cat. No.04CH37566), Sendai, Japan, 28 September–2 October 2004. [Google Scholar]
  11. Argentieri, S.; Danes, P. Broadband variations of the music high-resolution method for sound source localization in robotics. In Proceedings of the 2007 IEEE/RSJ International Conference on Intelligent Robots and Systems, San Diego, CA, USA, 29 October–2 November 2007; pp. 2009–2014. [Google Scholar]
  12. Nakadai, K.; Matsuura, D.; Okuno, H.G.; Kitano, H. Applying scattering theory to robot audition system: Robust sound source localization and extraction. In Proceedings of the 2003 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2003) (Cat. No.03CH37453), Las Vegas, NV, USA, 27–31 October 2003; pp. 1147–1152. [Google Scholar]
  13. Zhao, S.; Chng, E.S.; Hieu, N.T.; Li, H. A robust real-time sound source localization system for olivia robot. In Proceedings of the 2010 APSIPA Annual Summit and Conference, Biopolis, Singapore, 14–17 December 2010. [Google Scholar]
  14. Xiao, X.; Zhao, S.; Nguyen, D.H.H.; Zhong, X.; Jones, D.L.; Chng, E.-S.; Li, H. The ntu-adsc systems for reverberation challenge 2014. In Proc. REVERB Challenge Workshop; Spoken Language Systems MIT Computer Science and Artificial Intelligence Laboratory: Cambridge, MA, USA, 2014. [Google Scholar]
  15. Delikaris-Manias, S.; Vilkamo, J.; Pulkki, V. Signal-dependent spatial filtering based on weighted-orthogonal beamformers in the spherical harmonic domain. IEEE/ACM Trans. Audio Speech Lang. Process. 2016, 24, 1507–1519. [Google Scholar] [CrossRef]
  16. Delikaris-Manias, S.; Pulkki, V. Cross pattern coherence algorithm for spatial filtering applications utilizing microphone arrays. IEEE Trans. Audio Speech Lang. Process. 2013, 21, 2356–2367. [Google Scholar] [CrossRef]
  17. Zhang, C.; Florêncio, D.; Ba, D.E.; Zhang, Z. Maximum likelihood sound source localization and beamforming for directional microphone arrays in distributed meetings. IEEE Trans. Multimed. 2008, 10, 538–548. [Google Scholar] [CrossRef]
  18. Van den Bogaert, T.; Carette, E.; Wouters, J. Sound source localization using hearing aids with microphones placed behind-the-ear, in-the-canal, and in-the-pinna. Int. J. Audiol. 2011, 50, 164–176. [Google Scholar] [CrossRef]
  19. Wajid, M.; Alam, F.; Yadav, S.; Khan, M.A.; Usman, M. Support vector regression based direction of arrival estimation of an acoustic source. In Proceedings of the 2020 International Conference on Innovation and Intelligence for Informatics, Computing and Technologies (3ICT), Sakheer, Bahrain, 20–21 December 2020; pp. 1–6. [Google Scholar]
  20. Wajid, M.; Kumar, A.; Bahl, R. Design and analysis of air acoustic vector-sensor configurations for two-dimensional geometry. J. Acoust. Soc. Am. 2016, 139, 2815–2832. [Google Scholar] [CrossRef]
  21. Wajid, M.; Kumar, A.; Bahl, R. Bearing estimation in a noisy and reverberant environment using an air acoustic vector sensor. IUP J. Electr. Electron. Eng. 2016, 9, 53. [Google Scholar]
  22. Wajid, M.; Kumar, A.; Bahl, R. Direction-finding accuracy of an air acoustic vector sensor in correlated noise field. In Proceedings of the 2017 4th International Conference on Signal Processing, Computing and Control (ISPCC), Solan, India, 21–23 September 2017; pp. 21–25. [Google Scholar]
  23. Wajid, M.; Kumar, A.; Bahl, R. Direction-of-arrival estimation algorithms using single acoustic vector-sensor. In Proceedings of the 2017 International Conference on Multimedia, Signal Processing and Communication Technologies (IMPACT), Aligarh, India, 24–26 November 2017; pp. 84–88. [Google Scholar]
  24. Wajid, M.; Kumar, A. Direction estimation and tracking of coherent sources using a single acoustic vector sensor. Arch. Acoust. 2020, 45, 209–219. [Google Scholar]
  25. Yadav, S.; Wajid, M.; Usman, M. Support vector machine-based direction of arrival estimation with uniform linear array. In Advances in Computational Intelligence Techniques; Springer: Singapore, Singapore, 2020; pp. 253–264. [Google Scholar]
  26. Wajid, M.; Kumar, B.; Goel, A.; Kumar, A.; Bahl, R. Direction of arrival estimation with uniform linear array based on recurrent neural network. In Proceedings of the 2019 5th International Conference on Signal Processing, Computing and Control (ISPCC), Solan, India, 10–12 October 2019; pp. 361–365. [Google Scholar]
  27. Zhou, C.; Gu, Y.; Zhang, Y.D.; Shi, Z.; Jin, T.; Wu, X. Compressive sensing-based coprime array direction-of-arrival estimation. IET Commun. 2017, 11, 1719–1724. [Google Scholar] [CrossRef]
  28. Liu, A.; Yang, D.; Shi, S.; Zhu, Z.; Li, Y. Augmented subspace music method for doa estimation using acoustic vector sensor array. IET Radar Sonar Navig. 2019, 13, 969–975. [Google Scholar] [CrossRef]
  29. Shi, F. Two dimensional direction-of-arrival estimation using compressive measurements. IEEE Access 2019, 7, 20863–20868. [Google Scholar] [CrossRef]
  30. Cui, X.; Yu, K.; Zhang, S.; Wang, H. Azimuth-only estimation for tdoa-based direction finding with 3-d acoustic array. IEEE Trans. Instrum. Meas. 2019, 69, 985–994. [Google Scholar] [CrossRef]
  31. Meng, Z.; Zhou, W. Direction-of-arrival estimation in coprime array using the esprit-based method. Sensors 2019, 19, 707. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  32. Varma, K.M. Time Delay Estimate Based Direction of Arrival Estimation for Speech in Reverberant Environments. Ph.D. Thesis, Virginia Tech, Blacksburg, VA, USA, 2002. [Google Scholar]
  33. Zhang, L.; Wu, S.; Guo, A.; Yang, W. A novel direction-of-arrival estimation via phase retrieval with unknown sensor gain-and-phase errors. Sensors 2019, 19, 2701. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  34. He, W.; Yang, X.; Wang, Y. A high-resolution and low-complexity doa estimation method with unfolded coprime linear arrays. Sensors 2020, 20, 218. [Google Scholar] [CrossRef] [Green Version]
  35. Sun, B.; Wu, C.; Ruan, H. Array diagnosis and doa estimation for coprime array under sensor failures. Sensors 2020, 20, 2735. [Google Scholar] [CrossRef]
  36. Wu, X.; Zhu, W.P.; Yan, J.; Zhang, Z. Two sparse-based methods for off-grid direction-of-arrival estimation. Signal Process. 2018, 142, 87–95. [Google Scholar] [CrossRef]
  37. Wu, X. Localization of far-field and near-field signals with mixed sparse approach: A generalized symmetric arrays perspective. Signal Process. 2020, 175, 107665. [Google Scholar] [CrossRef]
  38. Liu, Y.; Chen, H.; Wang, B. Doa estimation based on cnn for underwater acoustic array. Appl. Acoust. 2021, 172, 107594. [Google Scholar] [CrossRef]
  39. Maia, A.; Ferreira, E.; Oliveira, M.C.; Menezes, L.F.; Andrade-Campos, A. 3-numerical optimization strategies for springback compensation in sheet metal forming. In Computational Methods and Production Engineering; Davim, J., Ed.; Woodhead Publishing Reviews: Mechanical Engineering Series; Woodhead Publishing: Cambridge, UK, 2017; pp. 51–82. [Google Scholar]
  40. Drucker, H.; Burges, C.J.C.; Kaufman, L.; Smola, A.; Vapnik, V. Support vector regression machines. In Advances in Neural Information Processing Systems; Mozer, M.C., Jordan, M., Petsche, T., Eds.; MIT Press: Cambridge, MA, USA, 1997; Volume 9, pp. 155–161. [Google Scholar]
  41. Smola, A.J.; Schölkopf, B. A tutorial on support vector regression. Stat. Comput. 2004, 14, 199–222. [Google Scholar] [CrossRef] [Green Version]
  42. King, A.P.; Eckersley, R.J. Chapter 2 - Descriptive Statistics II: Bivariate and Multivariate Statistics. In Statistics for Biomedical Engineers and Scientists; King, A.P., Eckersley, R.J., Eds.; Academic Press (Imprint of Elsevier): London, UK, 2019; pp. 23–56. [Google Scholar]
  43. Gavin, H.P. The Levenberg-Marquardt Algorithm for Nonlinear Least Squares Curve-Fitting Problems. J. Dept. Civ. Environ. Eng. Duke Univ. 2011, 28, 1–5. [Google Scholar]
  44. Moré, J.J. The levenberg-marquardt algorithm: Implementation and theory. In Numerical Analysis; Watson, G.A., Ed.; Springer: Berlin/Heidelberg, Germany, 1978; pp. 105–116. [Google Scholar]
Figure 1. Signal impinging on a uniform linear array, where filled triangles indicate the position of omni-directional microphones.
Figure 2. Comparison of different regression techniques (SVR, PR1, and PR2) and the GCC for (a) SNR = 22 dB, (b) SNR = 18 dB, (c) SNR = 14 dB, and (d) SNR = 10 dB.
Figure 3. Ensemble average of the PPMCC versus the direction of arrival at SNR = 26 dB.
Figure 4. Comparison of different regression techniques (SVR, PR1, and PR2) and the GCC with curve fitting for (a) SNR = 22 dB, (b) SNR = 18 dB, (c) SNR = 14 dB, and (d) SNR = 10 dB.
Figure 5. Comparison of SVR and the GCC with and without curve fitting for (a) SNR = 22 dB, (b) SNR = 18 dB, (c) SNR = 14 dB, and (d) SNR = 10 dB.
Figure 6. Bias comparison of SVR with curve fitting and the GCC with curve fitting for (a) SNR = 22 dB, (b) SNR = 18 dB, (c) SNR = 14 dB, and (d) SNR = 10 dB.
Figure 7. Comparison of the standard deviation of SVR with curve fitting (SVR-CF) and the GCC with curve fitting for (a) SNR = 22 dB, (b) SNR = 18 dB, (c) SNR = 14 dB, and (d) SNR = 10 dB.
Figure 8. Comparison between SVR-CF and SVR-CF-LR for the DoA estimate in terms of the R M S A E at the end-fire for (a) SNR = 22 dB, (b) SNR = 18 dB, (c) SNR = 14 dB, and (d) SNR = 10 dB.
Table 1. $\overline{RMSAE}$ (degrees) values with multiple regression and conventional techniques.

Base Technique | SNR (dB) | $\overline{RMSAE}$ without Curve Fitting (degrees) | $\overline{RMSAE}$ with Curve Fitting (degrees)
SVR | 22 | 0.428 | 0.41
SVR | 18 | 0.551 | 0.429
SVR | 14 | 0.966 | 0.464
SVR | 10 | 2.039 | 0.53
PR1 | 22 | 1.212 | 1.194
PR1 | 18 | 1.418 | 1.217
PR1 | 14 | 2.587 | 1.264
PR1 | 10 | 5.567 | 1.363
PR2 | 22 | 4.23 | 0.655
PR2 | 18 | 10.351 | 0.687
PR2 | 14 | 25.232 | 0.772
PR2 | 10 | 59.379 | 1.034
PR3 | 22 | 22.086 | 0.326
PR3 | 18 | Too High | 0.398
PR3 | 14 | Too High | 0.948
PR3 | 10 | Too High | 4.811
PR4 | 22 | Too High | 0.286
PR4 | 18 | Too High | 0.983
PR4 | 14 | Too High | 7.95
PR4 | 10 | Too High | 73.604
PR5 | 22 | Too High | 0.27
PR5 | 18 | Too High | 1.932
PR5 | 14 | Too High | 19.671
PR5 | 10 | Too High | High
PR6 | 22 | Too High | 0.275
PR6 | 18 | Too High | 2.183
PR6 | 14 | Too High | 21.895
PR6 | 10 | Too High | High
PR7 | 22 | Too High | 0.326
PR7 | 18 | Too High | 2.967
PR7 | 14 | Too High | 31.947
PR7 | 10 | Too High | High
GCC | 22 | 0.983 | 0.983
GCC | 18 | 0.988 | 0.988
GCC | 14 | 0.998 | 0.998
GCC | 10 | 1.027 | 1.027
Table 2. $\overline{RMSAE}$ of the DoA estimate for SVR-CF and SVR-CF-LR at the end-fire.

Technique | SNR (dB) | $\overline{RMSAE}$ (degrees)
SVR-CF | 22 | 0.410
SVR-CF | 18 | 0.429
SVR-CF | 14 | 0.464
SVR-CF | 10 | 0.530
SVR-CF-LR | 22 | 0.294
SVR-CF-LR | 18 | 0.322
SVR-CF-LR | 14 | 0.375
SVR-CF-LR | 10 | 0.473
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

