The TDoA model specifies that the
and
values represent the number of samples that will be formed until the sound reaches the
i-th microphone and the reference microphone, respectively, after it is produced.
is the amount of error that the GCC-Phat algorithm will produce due to environmental noise added to the source sound in real-time applications. The time
is the unmeasured time at which the sound arrives at the microphones, which, depending on the sampling frequency, is less than the sampling time
.
and
are the unmeasured times at which the sound reaches microphone i and the reference microphone, respectively. In our simulation study, the sound source, which is assumed to be 150 m away, was examined at five different angles of 0°, −30°, −45°, −60°, and −90°. The distance
between the microphones was increased from 0.5 m to 3 m in 0.5 m steps. In the simulation study, the sampling frequency
was set to 44.1 kHz. In
Figure 5, the LSE solutions obtained by solving Equation (15) for each
distance are shown as “*”, and the actual source position is shown as “o”. It can be observed that the location estimates approach the actual location value as the microphone spacing is increased.
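The sample-based TDoA model described above can be sketched in a few lines. This is a minimal illustration rather than the simulation code used in the study: the speed of sound (343 m/s), the function names, and the placement of the microphones on the x- and y-axes are assumptions, and the GCC-Phat error is passed in as a plain additive term.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value, not stated in the text


def arrival_samples(source, mic, fs, c=SPEED_OF_SOUND):
    """Split the travel time into whole samples n and an unmeasured remainder t."""
    travel_time = math.dist(source, mic) / c
    n = math.floor(travel_time * fs)   # whole samples elapsed before arrival
    t = travel_time - n / fs           # unmeasured part, smaller than 1/fs
    return n, t


def tdoa_samples(source, mic_i, mic_ref, fs, gcc_error=0):
    """TDoA in samples (microphone i minus reference) plus an additive error."""
    n_i, _ = arrival_samples(source, mic_i, fs)
    n_ref, _ = arrival_samples(source, mic_ref, fs)
    return n_i - n_ref + gcc_error


fs = 44_100
angle = math.radians(-30)
source = (150.0 * math.cos(angle), 150.0 * math.sin(angle))
for d in (0.5, 1.0, 1.5, 2.0, 2.5, 3.0):
    n_x = tdoa_samples(source, (d, 0.0), (0.0, 0.0), fs)
    n_y = tdoa_samples(source, (0.0, d), (0.0, 0.0), fs)
    print(f"d = {d:.1f} m -> TDoA samples: x-axis {n_x}, y-axis {n_y}")
```

With a small spacing the TDoA spans only a few tens of samples, so a one-sample quantization step is a large relative error; this is consistent with the estimates improving as the spacing grows.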
3.1. Origin of the Idea and Proposed Method
An analysis of
Figure 5 shows that
is insufficient for accurate location estimation, and that the location estimates change along the same line, even with high estimation errors. This observation suggests that the DoA estimation based on the multilateration method can be performed with high accuracy, independent of
.
The multilateration equations for a microphone array consisting of three microphones, with the first microphone as the reference, are shown in Equation (13). Assuming that the distance from the sound source to the first microphone,
, is known, Equation (13) can be rearranged as follows:
The simple representation of Equation (16) is as follows:
To perform localization on the plane, the distances of the microphones located on the perpendicular axes in the sensor array shown in
Figure 3a to the reference microphone are taken as
and
. Since the reference microphone is located at the origin,
. Accordingly, the
matrix is as follows:
vector:
If we consider that the second microphone is on the
x-axis and the third microphone is on the
y-axis, we can write
and
. It is clear that
and
, depending on the direction of sound arrival. Therefore, the
vector can be divided into two components and arranged as
The first component in Equation (20) is neglected, because it has a very small value compared to the second component. Therefore, Equation (16) can be approximated as follows:
The
position vector can be isolated from Equation (21) as follows:
As shown in Equation (22), the position vector of the sound source varies according to the TDoA values between the microphones. When the ratio of the
and
positions specified in Equation (1) is calculated according to Equation (22), it is equal to the ratio of the TDoA times, which gives the azimuth angle θ:
Similarly, in the three-dimensional Cartesian space, both the azimuth and elevation angles can be obtained using the four-microphone sensor array shown in
Figure 3b. For the sensor array shown in
Figure 3b, let the positions of the microphones be
and the position of the sound source be
. Accordingly, Equations (16) and (17) can be rearranged as follows:
In the new case, the fourth microphone is assumed to be in the z direction and the
and
matrices are rearranged as follows:
After reducing Equation (27) with approximations similar to those made in Equation (20), the
position vector is as follows:
If we substitute the results found in Equation (28) into Equation (2), the
elevation angle becomes as follows:
In the case where the distances
between the microphones are equal, Equations (23) and (29) take the following special form:
The sensor array layouts that satisfy Equations (23) and (29) are shown in
Figure 6.
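The ratio-based angle estimates derived in this subsection can be illustrated with a short sketch. It is not a reproduction of the numbered equations, and it rests on several assumptions that are not taken from the text: the TDoA is defined as arrival at microphone i minus arrival at the reference (so it is negative when microphone i is closer to the source), the elevation is measured from the xy-plane, and the function names are hypothetical. Because only ratios enter the two-argument arctangent, the inputs may be sample counts or any other common unit.

```python
import math


def azimuth_deg(n_x, n_y, d_x=1.0, d_y=1.0):
    """Azimuth from the TDoAs of the x-axis and y-axis microphones."""
    # Source coordinates are proportional to -n/d; the common factor cancels.
    return math.degrees(math.atan2(-n_y / d_y, -n_x / d_x))


def elevation_deg(n_x, n_y, n_z, d_x=1.0, d_y=1.0, d_z=1.0):
    """Elevation above the xy-plane (assumed convention) from three TDoAs."""
    x, y, z = -n_x / d_x, -n_y / d_y, -n_z / d_z
    return math.degrees(math.atan2(z, math.hypot(x, y)))


def ideal_tdoa(source, mic):
    """Noise-free path difference (mic minus reference at the origin), in metres."""
    origin = (0.0,) * len(source)
    return math.dist(source, mic) - math.dist(source, origin)


# Source 100 m away at 30 degrees azimuth and 20 degrees elevation, 1 m spacing.
az, el = math.radians(30), math.radians(20)
source = (100 * math.cos(el) * math.cos(az),
          100 * math.cos(el) * math.sin(az),
          100 * math.sin(el))
t_x = ideal_tdoa(source, (1.0, 0.0, 0.0))
t_y = ideal_tdoa(source, (0.0, 1.0, 0.0))
t_z = ideal_tdoa(source, (0.0, 0.0, 1.0))
print(azimuth_deg(t_x, t_y), elevation_deg(t_x, t_y, t_z))
```

The small residual error in this noise-free case comes from the neglected first component of the right-hand-side vector, which shrinks as the source distance grows relative to the microphone spacing.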
3.2. Simulations
The first simulation studies were performed in two-dimensional space using the fifth microphone array shown in
Figure 6. In real-time applications, when measuring the TDoA time, hardware time errors of
, depending on the sampling frequency of the audio recording devices, and
measurement sample errors caused by environmental noise as a result of the GCC-Phat algorithm are introduced. These errors are modeled in our simulation study as shown in
Figure 4.
The simulation studies in the XY plane assumed that the distance of the sound source is 100 m. The largest measurement errors in the direction of arrival are at angles below 10° [
29].
Figure 7 shows the results obtained using different
distances in the case of
and
.
Figure 8 shows the effect of the source position on the direction-of-arrival estimation for near and far distances. The results for the source at 20 m and 500 m distances are shown when the sampling frequency and the microphone spacing are fixed.
Figure 9 shows how the estimation error changes for angles from 0° to 90° in 1° steps as the source distance increases from 5 m to 100 m in 1 m steps and the microphone spacing increases from 0.1 m to 1 m in 0.1 m steps. The error values are calculated using the mean absolute error function:
The expressions and represent the true and measured angle values, respectively.
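For reference, the mean absolute error over a set of angles can be computed as in the following sketch. The function and argument names are hypothetical; the calculation itself is the standard average of absolute differences between true and measured values.

```python
def mean_absolute_error(true_angles, measured_angles):
    """Average of |true - measured| over paired angle values (same unit)."""
    if len(true_angles) != len(measured_angles) or not true_angles:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(t - m) for t, m in zip(true_angles, measured_angles))
    return total / len(true_angles)


print(mean_absolute_error([0, 10, 20], [1, 8, 20]))
```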
The mean error decreases below after and . Additionally, the maximum error values are reached as the ratio approaches one.
The results obtained for different sampling frequencies with fixed microphone spacing and source position are shown in
Figure 10.
The simulation studies that led to the results in
Figure 7,
Figure 8,
Figure 9 and
Figure 10 were carried out under the assumption that the GCC-Phat error in the model shown in
Figure 4 is zero. In real-time applications, errors occur in the measurement of TDoA values due to environmental noise, reverberation, and differences in microphone detection patterns. Normally distributed noise with
was added to represent these errors, and the mean absolute error values were obtained from 1000 simulation repetitions. The simulation results without the added error are shown in
Figure 11a, and the simulation results with the added error are shown in
Figure 11b.
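A noise experiment of this kind can be sketched as a small Monte Carlo loop. The sketch below is illustrative only: the noise standard deviation `sigma_samples`, the speed of sound, the source distance, the microphone spacing, and the sign convention (TDoA negative when the microphone is closer to the source than the reference) are assumptions, and the noise is added as a real-valued number of samples to the quantized TDoA.

```python
import math
import random


def quantized_tdoa(source, mic, fs, c):
    """TDoA in whole samples between `mic` and the reference at the origin."""
    n_mic = math.floor(math.dist(source, mic) / c * fs)
    n_ref = math.floor(math.hypot(*source) / c * fs)
    return n_mic - n_ref


def noisy_azimuth_mae(true_deg, sigma_samples, distance=100.0, d=1.0,
                      fs=44_100, c=343.0, repetitions=1000, seed=0):
    """Mean absolute azimuth error (degrees) under Gaussian TDoA noise."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    theta = math.radians(true_deg)
    source = (distance * math.cos(theta), distance * math.sin(theta))
    n_x = quantized_tdoa(source, (d, 0.0), fs, c)
    n_y = quantized_tdoa(source, (0.0, d), fs, c)
    total = 0.0
    for _ in range(repetitions):
        noisy_x = n_x + rng.gauss(0.0, sigma_samples)
        noisy_y = n_y + rng.gauss(0.0, sigma_samples)
        estimate = math.degrees(math.atan2(-noisy_y, -noisy_x))
        total += abs(estimate - true_deg)
    return total / repetitions


for angle in (10, 45, 80):
    clean = noisy_azimuth_mae(angle, sigma_samples=0.0)
    noisy = noisy_azimuth_mae(angle, sigma_samples=0.5)
    print(f"{angle:2d} deg: MAE without noise {clean:.3f}, with noise {noisy:.3f}")
```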
Figure 11a shows that our proposed method produces better results than other methods except for the 55–70-degree range.
Figure 11b shows that our proposed method produces more stable results than the other methods up to 80 degrees. Parsayan and Ahadi’s method is more successful in the 79–89-degree range.
Table 1 shows the total average error values of the methods in the 0–90-degree range.
To investigate the change in the estimation error when the distance between the pairs of microphones placed perpendicularly is not equal, three different microphone array geometries were created with
,
, and
, ensuring that the hypotenuses of the triangles formed by the microphones have equal lengths. The simulation results without error are shown in
Figure 12a, and those with noise are shown in
Figure 12b.
Although no clear difference can be seen in
Figure 12a, it can be seen in
Figure 12b that the three changing conditions are more successful in the 0–30-degree range and that two of them are more successful in the 60–90-degree range.
Table 2 shows the total average error values of the three different conditions in the 0–90-degree range.
The performance of the elevation angle estimation calculated by Equation (31) was investigated by simulation studies using the first microphone array in
Figure 6. The results obtained without adding error, assuming
,
,
, and azimuth angle
are shown in
Figure 13a, and those obtained by adding noise are shown in
Figure 13b.
The average error values of the error graphs shown in
Figure 13 are presented in
Table 3.
The far-field approach cannot produce a solution when the argument of the inverse function, especially near right angles, falls outside the range of
due to measurement errors in the TDoA value. This significantly impairs the measurement sensitivity at right angles. The solution in the far-field approach is also dependent on the speed of sound. The speed of sound varies depending on the weather conditions and affects the measurement results. Additionally, in the far-field approach, the number of samples obtained by the GCC-Phat algorithm is multiplied by
time and added to the formula. This increases the number of mathematical and algorithmic operations. Because our proposed arrival-angle detection method is derived from the source position equation, the quadrant in which the arrival angle lies is determined precisely by the
function. Studies have been performed to determine the azimuth and elevation angles with the
function, but the quadrant information is still obtained by the algorithm [
21,
22,
23]. Our proposed method yields the angle as the result of a single function rather than through conditional branches, as in other methods. Additionally, solutions obtained with the tangent function become indeterminate in regions where the cosine of the angle is close to zero, owing to measurement errors in the TDoA values.
Our proposed method uses the sample differences directly obtained from the GCC-Phat algorithm. Therefore, it produces independent results. The function does not create indeterminateness when the value is zero. Our proposed method is faster and produces more stable results, especially against measurement errors.
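The contrast drawn above can be illustrated with a short sketch. The far-field relation is assumed here to take the common arccosine form, with the sample count scaled by the speed of sound and the sampling period; the text does not reproduce that formula, so this form, the speed of sound (343 m/s), the sign convention (TDoA negative when the microphone is closer to the source), and the function names are all assumptions.

```python
import math


def far_field_azimuth_deg(n_x, d, fs, c=343.0):
    """Assumed far-field estimate: arccos of the scaled path difference.

    Returns None when measurement error pushes the argument outside [-1, 1].
    """
    argument = -c * n_x / (fs * d)
    if abs(argument) > 1.0:
        return None  # the inverse cosine is undefined here
    return math.degrees(math.acos(argument))


def ratio_azimuth_deg(n_x, n_y):
    """Ratio-based estimate for equal spacing; needs neither c nor fs."""
    return math.degrees(math.atan2(-n_y, -n_x))


# With d = 1 m and fs = 44.1 kHz the largest physical TDoA is about 128 samples,
# so a reading of -130 samples can only arise from measurement error.
print(far_field_azimuth_deg(-130, d=1.0, fs=44_100))
print(ratio_azimuth_deg(-130, 0))
print(far_field_azimuth_deg(-64, d=1.0, fs=44_100))
```

In this sketch the arccosine form has no solution for the out-of-range reading, while the two-argument arctangent still returns an angle and remains defined when one of its inputs is zero.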