1. Introduction
Global navigation satellite systems (GNSS) are widely used for outdoor positioning and navigation services [1, 2]. Indoors, however, wireless signals suffer from scattering, attenuation, and multipath propagation, so positioning performance degrades substantially [3, 4]. At the same time, indoor positioning is of great significance: it underpins applications such as emergency safety, crowd monitoring, precision marketing, entertainment and daily life, and social needs [5, 6].
Current indoor positioning technologies include Bluetooth, Wi-Fi, ultra-wideband (UWB), inertial measurement units (IMUs), and audio, each with its own characteristics. Wi-Fi [7, 8] and Bluetooth positioning with fingerprint technologies [9, 10] are easy to implement and compatible with mobile devices. However, fingerprint-based positioning [11] requires a location fingerprint database to be collected in advance, which is time-consuming and labor-intensive. UWB can achieve high-accuracy positioning with the triangulation method [12, 13]. However, UWB modules are expensive, and the technology is not yet widely supported by current smartphones. IMUs [14, 15] are frequently used in indoor pedestrian positioning systems because of their small size and low cost, but their cumulative error limits their use in long-term positioning.
Acoustic indoor positioning systems (AIPSs) [16, 17, 18, 19, 20] offer several advantages over radio frequency (RF)-based [21] and IMU-based positioning systems: (1) Low cost: AIPSs are generally more affordable than RF-based systems, because the basic components required for acoustic localization, such as microphones and speakers, are relatively inexpensive and widely available. (2) High accuracy: acoustic waves propagate far more slowly than RF signals, so a given timing error translates into a much smaller distance error, which allows more precise distance measurements and localization. (3) High availability: when the human body or furniture obstructs the direct path, acoustic waves diffract around the obstacle and still reach the microphone, so AIPSs can function in various indoor settings without significant signal degradation. (4) Easy integration and handling: AIPSs can be incorporated into existing buildings and infrastructure without extensive modifications, and their handling and maintenance are typically straightforward. Owing to these advantages, acoustic indoor positioning systems have gained prominence in the field of ranging and positioning technologies alongside RF- and IMU-based systems.
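The accuracy argument can be illustrated with a short calculation. The sketch below is not taken from the paper; the speed of sound (343 m/s, air at roughly 20 °C), the 0.1 ms timing error, and the 48 kHz sample rate are all assumed values chosen only for illustration.

```python
SPEED_OF_SOUND_M_S = 343.0          # assumed: air at about 20 °C
SPEED_OF_LIGHT_M_S = 299_792_458.0  # propagation speed of RF signals
SAMPLE_RATE_HZ = 48_000             # assumed smartphone audio sample rate


def range_error_m(timing_error_s: float, speed_m_s: float) -> float:
    """Distance error caused by a timing error for a wave of the given speed."""
    return timing_error_s * speed_m_s


timing_error_s = 1e-4  # hypothetical 0.1 ms timestamp error

acoustic_error_m = range_error_m(timing_error_s, SPEED_OF_SOUND_M_S)
rf_error_m = range_error_m(timing_error_s, SPEED_OF_LIGHT_M_S)

# Distance travelled by sound during one audio sample period.
sample_resolution_m = SPEED_OF_SOUND_M_S / SAMPLE_RATE_HZ

print(f"acoustic ranging error: {acoustic_error_m:.4f} m")
print(f"RF ranging error:       {rf_error_m:.0f} m")
print(f"one-sample resolution:  {sample_resolution_m * 1000:.1f} mm")
```

Under these assumptions, the same timing error costs a few centimeters of range for sound but tens of kilometers for RF, which is why acoustic ranging tolerates much looser timing.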
Given these properties, researchers have been investigating the use of acoustic signals for indoor positioning. A linear frequency modulation (LFM) signal, also called a chirp signal, has been applied as the positioning source. In reference [22], the author estimated the TOA of an acoustic chirp signal but did not achieve positioning for a smartphone. In the article [23], the author used indoor acoustic fingerprints to achieve room-level positioning and differentiation. However, because the system lacks an acoustic base station, its practical application is strongly affected by environmental noise. In reference [24], a chirp whose frequency increases linearly encodes a one, and a chirp whose frequency decreases linearly encodes a zero. Once the acoustic node information is decoded, the smartphone and the acoustic node are considered to be in the same area. However, the system cannot obtain time difference of arrival (TDOA) information, so decimeter-level positioning cannot be achieved. In the article, the author proposed a transmission scheme combining time division multiple access (TDMA) and frequency division multiple access (FDMA). However, hardware constraints limit the number of acoustic nodes in this scheme, which makes the nodes difficult to deploy in practical applications. Moreover, the system generated only two codes for four acoustic nodes, which can easily lead to incorrect identification of the nodes. In reference [
25], the author employed a pseudo-random noise (PRN) signal to encode the transmitted audio signal. Because the PRN signal is not up-converted, the received signal is susceptible to environmental noise interference. In references [26, 27], the authors used a microphone array module to locate the target. However, the system is designed for robot platforms and is not suitable for everyday users. In reference [28], the author used TDOA as a fingerprint to locate the sound source. User capacity is limited in such systems, and the TDOA-fingerprint method requires fingerprint information to be collected in advance, which is time-consuming and laborious. In reference [29], the authors studied FDM-CDMA sound signals for indoor localization of unmanned aerial vehicles (UAVs), obtaining time of arrival (TOA) information from the maximum of the cross-correlation. However, real indoor environments suffer from multipath interference caused by obstructions, and the strongest correlation peak may then belong to a reflection rather than to the direct path. As a result, detecting the cross-correlation maximum may not be suitable for real-world scenarios. In reference [
30], the authors utilized a threshold method for TOA estimation and developed a frequency division, spatial division, and time division positioning system, achieving good positioning results. However, this system has two key shortcomings: (1) The TDOA measurement method uses signals from adjacent base stations for measurement (for example,
,
and
). This is because the system adopts an ARM architecture, which leads to a clock offset issue. (2) The system has only two coded signals, which are shared among multiple base stations. The system must therefore be told in advance which area the user is in, and individual base stations cannot be identified. In reference [31], the author used a single acoustic base station to measure the relative displacement of mobile phones and combined it with pedestrian dead reckoning (PDR) information, achieving good positioning results. However, the clock of the single acoustic base station drifts over time, and the author did not analyze this issue. Furthermore, the author neither encoded the signals nor analyzed multipath issues. In reference [32], the author used the semantic information of acoustic signals for indoor localization. However, this approach is susceptible to factors such as noise and human obstruction, which makes it impractical for real-world scenarios. In reference [33], the author combined the normalization method and the threshold method to detect the first path. However, in practical scenarios with strong shadowing, the first path is attenuated, which makes it difficult to determine an appropriate threshold. In reference [
34], the author proposed a positioning scheme for underground spaces using acoustic signals but did not discuss in detail the impact of multipath or how to mitigate it.
The limitations of the above works can be grouped as follows. At the signal layer, the chirp signals used in [22, 30, 31] cannot provide distinct codes for multiple base stations at the same time. Regarding ranging, the methods in [28, 32] cannot achieve effective ranging. Regarding system synchronization, the systems in [30, 31] do not achieve true synchronization, and clock offset in the system is not discussed. For TOA and TDOA estimation, the authors of [33, 34] proposed no improved methods and continued to use the classic threshold method and cross-correlation maximum method. Regarding user capacity, the systems in [25, 26, 27] do not support multiple users: when several users transmit signals simultaneously, the system cannot locate them.
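To make the chirp-based coding discussed for reference [24] concrete, the following sketch encodes bits as up-chirps and down-chirps and decodes them by correlating each symbol with both templates. It is an illustration only, not the implementation of any cited system: the sample rate, symbol length, sweep band, and noise level are assumed values, and the decoder assumes that symbol boundaries are already known.

```python
import numpy as np

FS = 48_000                         # sample rate in Hz (assumed)
DURATION = 0.02                     # symbol length in seconds (assumed)
F_LOW, F_HIGH = 17_000.0, 21_000.0  # sweep band in Hz (assumed)


def lfm_chirp(f_start, f_end, duration=DURATION, fs=FS):
    """Linear frequency-modulated chirp sweeping from f_start to f_end."""
    t = np.arange(int(round(duration * fs))) / fs
    sweep_rate = (f_end - f_start) / duration
    return np.cos(2 * np.pi * (f_start * t + 0.5 * sweep_rate * t**2))


UP_CHIRP = lfm_chirp(F_LOW, F_HIGH)    # encodes bit 1
DOWN_CHIRP = lfm_chirp(F_HIGH, F_LOW)  # encodes bit 0


def encode(bits):
    """Concatenate one chirp per bit."""
    return np.concatenate([UP_CHIRP if b else DOWN_CHIRP for b in bits])


def decode(signal):
    """Correlate each symbol-length frame with both templates; keep the better match."""
    n = len(UP_CHIRP)
    bits = []
    for start in range(0, len(signal) - n + 1, n):
        frame = signal[start:start + n]
        up_score = abs(np.dot(frame, UP_CHIRP))
        down_score = abs(np.dot(frame, DOWN_CHIRP))
        bits.append(1 if up_score > down_score else 0)
    return bits


rng = np.random.default_rng(0)
sent = [1, 0, 1, 1, 0, 0, 1, 0]
noise = 0.5 * rng.standard_normal(len(sent) * len(UP_CHIRP))
received = encode(sent) + noise
decoded = decode(received)
print(decoded == sent)
```

With only these two waveforms, every node that uses the scheme transmits from the same two-symbol alphabet, which is consistent with the node-identification limitation noted above.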
Based on the above discussion, robust acoustic positioning systems remain a significant research topic in both academia and industry. In this work, we develop the RATBILS system, which only requires users to carry a mobile phone for direct positioning, without additional auxiliary devices. Indoor acoustic channels suffer from high noise levels, echoes, multipath propagation, and the Doppler effect, all of which reduce the quality of the transmitted signal, limit the communication range, and cause demodulation errors. Traditional digital modulation techniques for wireless communication, such as amplitude and phase modulation, are therefore not directly suitable for our system. In addition, acoustic waves experience considerably higher attenuation (i.e., loss of signal strength with distance) than electromagnetic waves of similar frequencies, which results in low-amplitude received signals and limits the operating range of the system. For example, the work in [35] achieved acoustic communication only within a range of 0.7 m. In underwater acoustic communication [36, 37, 38] and wireless communication [39, 40, 41], chirp spread spectrum (CSS) techniques are recognized as highly efficient information transmission schemes. In CSS schemes, the receiver applies matched filtering (MF) to maximize the signal-to-noise ratio (SNR), which extends the communication range. These techniques are also highly effective against low-amplitude received signals, interference, and selective fading. Building on CSS, we have developed the FDM-CSS solution, which balances positioning performance and coding effectiveness.
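The benefit of matched filtering for weak chirps can be sketched as follows. This is a minimal illustration rather than the FDM-CSS design itself: the two node names, their sub-bands, the 40 ms chirp length, the delays, and the noise level are hypothetical. Each node sweeps its own frequency band, the chirps are buried in noise at roughly -9 dB per-sample SNR, and correlating with each node's template separates the nodes and locates their arrivals.

```python
import numpy as np

FS = 48_000  # sample rate in Hz (assumed)


def lfm_chirp(f_start, f_end, duration, fs=FS):
    """Linear frequency-modulated chirp sweeping from f_start to f_end."""
    t = np.arange(int(round(duration * fs))) / fs
    sweep_rate = (f_end - f_start) / duration
    return np.cos(2 * np.pi * (f_start * t + 0.5 * sweep_rate * t**2))


# Hypothetical band plan: each acoustic node sweeps its own sub-band.
NODE_BANDS_HZ = {
    "node_A": (16_000.0, 18_000.0),
    "node_B": (18_500.0, 20_500.0),
}
TEMPLATES = {
    name: lfm_chirp(f0, f1, 0.04) for name, (f0, f1) in NODE_BANDS_HZ.items()
}


def matched_filter_delay(received, template):
    """Return the sample lag at which the matched-filter output magnitude peaks."""
    mf_output = np.correlate(received, template, mode="valid")
    return int(np.argmax(np.abs(mf_output)))


rng = np.random.default_rng(1)
received = rng.standard_normal(12_000)  # unit-variance noise, 0.25 s
true_delays = {"node_A": 3_000, "node_B": 5_200}
for name, delay in true_delays.items():
    chirp = TEMPLATES[name]
    received[delay:delay + len(chirp)] += 0.5 * chirp  # weak chirp under the noise

estimates = {
    name: matched_filter_delay(received, template)
    for name, template in TEMPLATES.items()
}
signal_power = np.mean((0.5 * TEMPLATES["node_A"]) ** 2)
input_snr_db = 10 * np.log10(signal_power / 1.0)  # noise variance is 1
print(estimates, f"input SNR ≈ {input_snr_db:.1f} dB")
```

In this setup the estimated delays should land on or next to the true ones even though each chirp is weaker than the noise sample by sample, because the matched filter accumulates energy over the whole chirp.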
During positioning, the smartphone must first acquire distance information. Obtaining reliable distance information is challenging, however, because of indoor multipath and human obstruction. In addition, we receive the signal with the smartphone's internal microphone, whose signal quality is degraded by the phone case and is inferior to that of a custom-made microphone. To make distance measurement more reliable, we have devised a two-stage scheme of coarse detection and fine detection. In the coarse detection stage, we combine spectral subtraction with MF-backtracking to find the approximate starting position of the received signal. In the fine detection stage, we combine the multi-threshold grouping method with normalization techniques to strengthen the first path and improve detection accuracy.
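The general idea of a coarse stage followed by a fine stage can be sketched with a much simpler stand-in. The code below is not the spectral subtraction, MF-backtracking, or multi-threshold grouping method of this work. It assumes a simulated two-path channel in which the direct path is weaker than a later reflection, takes the matched-filter maximum as the coarse estimate, and then searches backwards in a window for the earliest normalized peak above a single threshold. The window length, threshold, main-lobe width, and channel parameters are all hypothetical.

```python
import numpy as np

FS = 48_000  # sample rate in Hz (assumed)


def lfm_chirp(f_start, f_end, duration, fs=FS):
    """Linear frequency-modulated chirp sweeping from f_start to f_end."""
    t = np.arange(int(round(duration * fs))) / fs
    sweep_rate = (f_end - f_start) / duration
    return np.cos(2 * np.pi * (f_start * t + 0.5 * sweep_rate * t**2))


def coarse_fine_toa(received, template, search_back=400, threshold=0.25, main_lobe=12):
    """Simplified two-stage arrival estimate (illustrative parameters only)."""
    mf = np.abs(np.correlate(received, template, mode="valid"))
    mf /= mf.max()                       # normalize: strongest path has magnitude 1
    coarse = int(np.argmax(mf))          # coarse stage: strongest path
    start = max(0, coarse - search_back)
    # Fine stage: earliest sample in the look-back window above the threshold.
    above = np.flatnonzero(mf[start:coarse + 1] >= threshold)
    first_crossing = start + int(above[0])   # always exists because mf[coarse] == 1
    # Snap to the local maximum within roughly one main-lobe width.
    stop = min(first_crossing + main_lobe, coarse) + 1
    fine = first_crossing + int(np.argmax(mf[first_crossing:stop]))
    return coarse, fine


template = lfm_chirp(17_000.0, 21_000.0, 0.02)
rng = np.random.default_rng(2)
received = 0.05 * rng.standard_normal(6_000)
direct_delay, reflection_delay = 2_000, 2_150
received[direct_delay:direct_delay + len(template)] += 0.4 * template          # obstructed direct path
received[reflection_delay:reflection_delay + len(template)] += 1.0 * template  # stronger reflection

coarse, fine = coarse_fine_toa(received, template)
bias_m = (coarse - fine) / FS * 343.0  # range bias if the coarse peak were used (343 m/s assumed)
print(coarse, fine, f"{bias_m:.2f} m")
```

In this simulation the coarse estimate locks onto the reflection, while the backward search recovers the weaker direct path. Using the correlation maximum alone would bias the range by roughly a meter here.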
Acoustic indoor positioning technology uses microphones and speakers for localization and has the following characteristics. Because sound propagates much more slowly than radio waves, the synchronization requirements for high-precision acoustic positioning are relaxed. For location-based service providers, acoustic nodes must be deployed indoors, but the affordability of commercial acoustic components is expected to keep infrastructure investment cost-effective while achieving sub-meter-level indoor positioning. For users, microphones and speakers are standard features of handheld mobile devices, so high-precision positioning services come at no additional cost. These advantages have sparked increasing interest among researchers in acoustic indoor positioning technology. We took these advantages into consideration in developing the RATBILS system. Specifically, our contributions are as follows:
- (1) We have designed an active sensing system that allows smartphones to determine their location without requiring users to carry any additional sensors.
- (2) We propose a robust time-delay estimation algorithm, referred to as the coarse detection and fine detection method, which provides a reliable basis for accurate TDOA measurements and localization.
- (3) We have designed an FIR-MF detector to detect the encoded chirp signals transmitted by different acoustic nodes.
- (4) Through coarse and fine detection, our method adapts to adverse conditions such as human body occlusion and strong multipath interference.
- (5) Extensive experimental results demonstrate that the proposed system is accurate and robust for smartphone localization in two real-world scenarios.