Leveraging Bluetooth and GPS Sensors for Route-Level Passenger Origin–Destination Flow Estimation
Abstract
1. Introduction
2. Literature Review
2.1. Automated Fare Collection (AFC)-Based Approaches
2.2. Wireless Signal Detection Methods
2.3. Video Analytics and IoT Solutions
2.4. Hybrid and Emerging Methodologies
2.5. Limitations of Existing Approaches
3. Methodology
3.1. Feature Computation
3.1.1. MAC Address Features
1. Detection Count: The total number of detections of MAC address m, i.e., the number of entries in the Bluetooth dataset that contain m.
2. Detection Duration: The time span between the first and last detections of MAC address m, measured in seconds (s). It is computed as the last detection timestamp minus the first detection timestamp.
3. Mean RSSI: The average received signal strength indication (RSSI) of MAC address m, measured in dBm. It characterizes the overall signal level and is computed as the arithmetic mean of all RSSI values recorded for m.
4. Maximum RSSI: The peak RSSI value observed for MAC address m, measured in dBm. It reflects either the closest physical proximity between the device and the detector or the most favorable signal conditions, and is computed as the maximum of all RSSI values recorded for m.
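The four MAC address features can be illustrated with a minimal Python sketch. The record layout `(timestamp, mac, rssi_dbm)`, the function name `mac_features`, and the sample values are assumptions made for illustration; they are not taken from the dataset used in the paper.

```python
from collections import defaultdict
from datetime import datetime


def mac_features(records):
    """Compute per-MAC features from (timestamp, mac, rssi_dbm) records.

    The record layout is a hypothetical one chosen for this sketch.
    """
    by_mac = defaultdict(list)
    for ts, mac, rssi in records:
        by_mac[mac].append((ts, rssi))

    features = {}
    for mac, obs in by_mac.items():
        times = [ts for ts, _ in obs]
        rssis = [rssi for _, rssi in obs]
        features[mac] = {
            # number of dataset entries that contain this MAC address
            "detection_count": len(obs),
            # last detection timestamp minus first detection timestamp, in seconds
            "detection_duration_s": (max(times) - min(times)).total_seconds(),
            # arithmetic mean of all RSSI values, in dBm
            "mean_rssi_dbm": sum(rssis) / len(rssis),
            # peak RSSI value, in dBm
            "max_rssi_dbm": max(rssis),
        }
    return features


# Made-up sample records, for illustration only.
sample = [
    (datetime(2024, 1, 1, 10, 0, 0), "AA:BB", -70.0),
    (datetime(2024, 1, 1, 10, 0, 30), "AA:BB", -60.0),
    (datetime(2024, 1, 1, 10, 2, 0), "AA:BB", -80.0),
    (datetime(2024, 1, 1, 10, 1, 0), "CC:DD", -90.0),
]
result = mac_features(sample)
```

A device seen only once, such as `CC:DD` above, has a detection duration of zero, which is one reason this feature helps separate passing devices from on-board ones.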
3.1.2. Vehicular Mobility Features
1. Initial Detection Distance: The distance, in meters (m), between the location at which MAC address m is first detected and the nearest station. It is taken as the smaller of the distances to the preceding and subsequent stations at the first detection.
2. Final Detection Distance: The linear distance, in meters (m), between the location of the last detection of MAC address m and the nearest station. It is computed in the same way as the initial detection distance.
3. Travel Distance: The total distance, in meters (m), traveled by the vehicle between the first and last detection timestamps of MAC address m.
4. Average Velocity: The mean speed of the vehicle, in meters per second (m·s⁻¹), over the interval between the first and last detections of MAC address m.
5. Maximum Velocity: The peak instantaneous speed of the vehicle, in meters per second (m·s⁻¹), during the detection period of MAC address m.
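As a rough illustration of the vehicular mobility features, the sketch below derives them from GPS fixes that fall inside the detection window of one MAC address. Several choices here are assumptions rather than the paper's formulas: distances use the haversine formula, the nearest station is searched among all listed stations (the paper works with the preceding and subsequent stations), and the maximum velocity is approximated by the fastest segment between consecutive fixes. The names `haversine_m` and `mobility_features` and the coordinates are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, an approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def mobility_features(gps, t_first, t_last, stations):
    """gps: time-sorted list of (t_seconds, lat, lon); stations: list of (lat, lon)."""
    window = [fix for fix in gps if t_first <= fix[0] <= t_last]
    if not window:
        raise ValueError("no GPS fixes inside the detection window")

    def nearest_station_m(fix):
        return min(haversine_m(fix[1], fix[2], s_lat, s_lon) for s_lat, s_lon in stations)

    travel, max_speed = 0.0, 0.0
    for (t0, lat0, lon0), (t1, lat1, lon1) in zip(window, window[1:]):
        d = haversine_m(lat0, lon0, lat1, lon1)
        travel += d  # accumulate distance over consecutive fixes
        if t1 > t0:
            max_speed = max(max_speed, d / (t1 - t0))  # per-segment speed
    elapsed = window[-1][0] - window[0][0]
    return {
        "initial_detection_distance_m": nearest_station_m(window[0]),
        "final_detection_distance_m": nearest_station_m(window[-1]),
        "travel_distance_m": travel,
        "average_velocity_mps": travel / elapsed if elapsed > 0 else 0.0,
        "max_velocity_mps": max_speed,
    }


# Made-up fixes along the equator, for illustration only.
gps_fixes = [(0, 0.0, 0.0), (10, 0.0, 0.001), (20, 0.0, 0.003)]
station_coords = [(0.0, 0.0005), (0.0, 0.003)]
features = mobility_features(gps_fixes, t_first=0, t_last=20, stations=station_coords)
```

If the GPS log already carries an instantaneous speed field, taking its maximum over the window would be a closer match to the definition of maximum velocity than the per-segment approximation used here.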
3.2. Passenger–Non-Passenger Differentiation
3.2.1. Objective Function
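For reference, the standard FCM objective function of Bezdek et al. [25] is reproduced below. The notation is an assumption of this sketch: u_{ik} is the membership degree of sample k in cluster i, x_k is the k-th sample, v_i is the i-th cluster centroid, q is the fuzzification exponent (q > 1), and d_{ik} is the distance between x_k and v_i (a Mahalanobis-type distance defined through the matrix A in this paper).

```latex
J(U, V) = \sum_{i=1}^{c} \sum_{k=1}^{n} u_{ik}^{\,q} \, d_{ik}^{2},
\qquad \text{subject to} \quad \sum_{i=1}^{c} u_{ik} = 1 \quad \text{for all } k = 1, \dots, n .
```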
3.2.2. Membership Update
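The standard FCM membership update, which minimizes the objective function for fixed centroids, takes the form below. The notation is an assumption of this sketch: u_{ik} is the membership degree of sample k in cluster i, d_{ik} is the distance between sample k and centroid i, q is the fuzzification exponent (q > 1), and all distances are assumed to be strictly positive.

```latex
u_{ik} = \left[ \sum_{j=1}^{c} \left( \frac{d_{ik}}{d_{jk}} \right)^{\frac{2}{q-1}} \right]^{-1},
\qquad i = 1, \dots, c, \quad k = 1, \dots, n .
```

With c = 2 and q = 2, this reduces to u_{1k} = d_{2k}^2 / (d_{1k}^2 + d_{2k}^2), so a sample that is equally far from both centroids receives a membership of 0.5 in each cluster.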
3.2.3. Centroid Update
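The standard FCM centroid update, which minimizes the objective function for fixed memberships, is a membership-weighted mean of the samples. The notation is an assumption of this sketch: v_i is the i-th cluster centroid, x_k is the k-th sample, u_{ik} is the membership degree of sample k in cluster i, and q is the fuzzification exponent.

```latex
v_i = \frac{\sum_{k=1}^{n} u_{ik}^{\,q} \, x_k}{\sum_{k=1}^{n} u_{ik}^{\,q}},
\qquad i = 1, \dots, c .
```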
4. Solution
4.1. Rationale for Clustering Algorithm Selection
- (a) Label ambiguity: Ground-truth labels for passenger status are typically unavailable in transit monitoring systems. FCM discovers latent cluster structure without requiring pre-classified training data [25].
- (b) Feature overlap: The nine-dimensional feature space exhibits nonlinear correlations between variables (e.g., detection duration vs. average speed). FCM’s fuzzy membership assignment handles overlapping cluster boundaries better than hard clustering methods [26].
- (c) Dimensional complexity: With nine heterogeneous features spanning temporal, spatial, and signal-strength domains, the Mahalanobis distance metric in FCM weights feature contributions by their variances and covariances during cluster formation.
4.2. Pseudocode for Solving the Model
Algorithm 1: Passenger Classification via FCM Clustering
Input: Normalized feature matrix X, cluster count c = 2, fuzziness exponent = 2.0, convergence threshold = 10⁻⁵, maximum iterations T = 1000
Output: Membership matrix U, cluster centroid matrix V
1. Initialize U with random memberships such that the memberships of each sample sum to 1
2. Compute the initial centroids V via Equation (15)
3. Repeat:
   a. Update U using Equation (14)
   b. Update V using Equation (15)
   c. Calculate the objective function value via Equation (13)
   d. Until the convergence threshold is met or T iterations are reached
4. Assign labels: each sample is assigned to the class in which its membership degree is highest
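The steps of Algorithm 1 can be sketched in plain Python as follows. This is a simplified illustration, not the authors' implementation: it uses the squared Euclidean distance instead of the Mahalanobis-type distance, it stops when the change in the objective value falls below the threshold (the exact stopping test is an assumption), and the function name `fcm`, the symbol `q` for the fuzziness exponent, and the toy data are hypothetical.

```python
import random


def fcm(X, c=2, q=2.0, eps=1e-5, max_iter=1000, seed=0):
    """Fuzzy C-Means on a list of feature vectors; returns (U, V, labels)."""
    rng = random.Random(seed)
    n, dim = len(X), len(X[0])

    # Step 1: random memberships, normalized so each sample's row sums to 1.
    U = []
    for _ in range(n):
        row = [rng.random() + 1e-12 for _ in range(c)]
        total = sum(row)
        U.append([u / total for u in row])

    def sqdist(x, v):
        # Squared Euclidean distance (swapped in for the Mahalanobis-type distance).
        return sum((a - b) ** 2 for a, b in zip(x, v))

    def centroids(U):
        # Membership-weighted mean of the samples, one centroid per cluster.
        V = []
        for i in range(c):
            w = [U[k][i] ** q for k in range(n)]
            w_sum = sum(w)
            V.append([sum(w[k] * X[k][d] for k in range(n)) / w_sum for d in range(dim)])
        return V

    V = centroids(U)  # Step 2: initial centroids
    prev_J = float("inf")
    for _ in range(max_iter):  # Step 3: alternate updates
        for k in range(n):
            d2 = [max(sqdist(X[k], V[i]), 1e-12) for i in range(c)]
            for i in range(c):
                # With squared distances the exponent 2/(q-1) becomes 1/(q-1).
                U[k][i] = 1.0 / sum((d2[i] / d2[j]) ** (1.0 / (q - 1.0)) for j in range(c))
        V = centroids(U)
        J = sum(U[k][i] ** q * sqdist(X[k], V[i]) for k in range(n) for i in range(c))
        if abs(prev_J - J) < eps:  # assumed stopping test on the objective value
            break
        prev_J = J

    # Step 4: label each sample with its highest-membership cluster.
    labels = [max(range(c), key=lambda i: U[k][i]) for k in range(n)]
    return U, V, labels


# Two well-separated toy groups in a 2-D normalized feature space.
X = [[0.0, 0.1], [0.1, 0.0], [0.05, 0.05], [0.9, 1.0], [1.0, 0.9], [0.95, 0.95]]
U, V, labels = fcm(X)
```

The cluster indices returned by FCM carry no meaning on their own, so deciding which cluster corresponds to passengers still requires inspecting the centroids (for example, the one with the longer detection duration and travel distance).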
5. Model Validation
5.1. Experimental Context
- (1) Bluetooth data: Bluetooth detectors are installed in the middle section of the vehicle and collect Bluetooth MAC address data during the journey, which serve as input to the proposed method. Each record includes the detection time, MAC address, category (BT/BLE), RSSI, etc.
- (2) Vehicle GPS data: These also serve as input to the proposed method. Each record includes the detection time, longitude, latitude, direction of travel, the station at which the vehicle is about to arrive, etc.
- (3) Ground truth collection: First, the arrival times at each station are determined for the target vehicle on two trips and for the vehicle preceding it (i.e., the vehicle that arrives at each station before the target vehicle), as shown in Figure 4a,b. Passengers who arrive at a station between the arrival of the preceding vehicle and the arrival of the target vehicle are regarded as passengers of the target vehicle. For example, in Figure 4a, the preceding vehicle arrives at Station 2 at 10:02:05 and the target vehicle arrives at 10:09:27. The APC data of passengers who swipe their cards to enter at Station 2 within this interval (including the station entry and exit times and the boarding and alighting station numbers) are regarded as the data of passengers boarding the target vehicle at Station 2, and so on for the other stations.
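The time-window rule used to build the ground truth can be sketched as follows. The record layout, the function name `boarders_for_target`, the calendar date, and the tap-in times are hypothetical; only the two Station 2 arrival times come from the example above. Treating the window as open at the preceding arrival and closed at the target arrival is also an assumption.

```python
from datetime import datetime


def boarders_for_target(taps, windows):
    """Select tap-in records that fall inside each station's boarding window.

    taps: list of (station_id, tap_time)
    windows: {station_id: (preceding_vehicle_arrival, target_vehicle_arrival)}
    """
    selected = []
    for station, tap_time in taps:
        if station not in windows:
            continue
        preceding_arrival, target_arrival = windows[station]
        # Assumed boundary handling: after the preceding vehicle, up to the target vehicle.
        if preceding_arrival < tap_time <= target_arrival:
            selected.append((station, tap_time))
    return selected


# Station 2 window from the example; the date itself is arbitrary.
windows = {2: (datetime(2024, 1, 1, 10, 2, 5), datetime(2024, 1, 1, 10, 9, 27))}
taps = [
    (2, datetime(2024, 1, 1, 10, 1, 0)),   # before the preceding vehicle: excluded
    (2, datetime(2024, 1, 1, 10, 5, 0)),   # inside the window: included
    (2, datetime(2024, 1, 1, 10, 9, 27)),  # at the target arrival: included
    (2, datetime(2024, 1, 1, 10, 10, 0)),  # after the target vehicle: excluded
]
boarders = boarders_for_target(taps, windows)
```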
5.2. Experimental Results
- (1) Boarding and alighting ratios: To quantify how well the proposed method predicts boarding and alighting ratios, the mean absolute error (MAE) between the BLE estimates and the ground-truth values is calculated, and the accuracy rate is derived from it.
- (2) Comparison with the HD-based method [28], which estimates the origin–destination (OD) flows from historical boarding and alighting data: Three widely used indicators of prediction accuracy, namely the mean squared error (MSE), the mean absolute error (MAE), and the similarity, are used to analyze the OD values estimated by the different methods. The data of Station 1 and Station 13 are excluded from the calculation to avoid interference from possible special situations.
  - (a) Mean squared error (MSE). The MSE measures the deviation between the predicted values and the true values. It averages the squared prediction errors, so it reflects the overall deviation of the prediction results and penalizes large errors more heavily.
  - (b) Mean absolute error (MAE). The MAE averages the absolute errors between the predicted values and the true values, and therefore directly reflects the average deviation of the prediction results.
  - (c) Similarity. The cosine similarity is used to measure the similarity between OD matrices. Here, each vector represents the passenger flow distribution over the origin–destination pairs. Because the cosine similarity quantifies the angle between two vectors and is unaffected by their magnitudes, it highlights the similarity of the flow patterns estimated by different methods without being dominated by the absolute flow volumes.
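The three indicators can be computed on flattened OD matrices as in the sketch below. The function names and the toy matrices are assumptions for illustration, and the sketch does not reproduce any normalization or station exclusion applied in the paper.

```python
import math


def flatten(matrix):
    """Turn an OD matrix (list of rows) into a single vector."""
    return [value for row in matrix for value in row]


def mse(estimated, truth):
    """Mean of the squared differences between two equal-length vectors."""
    return sum((e - t) ** 2 for e, t in zip(estimated, truth)) / len(truth)


def mae(estimated, truth):
    """Mean of the absolute differences between two equal-length vectors."""
    return sum(abs(e - t) for e, t in zip(estimated, truth)) / len(truth)


def cosine_similarity(estimated, truth):
    """Cosine of the angle between two vectors; independent of their magnitudes."""
    dot = sum(e * t for e, t in zip(estimated, truth))
    norm_e = math.sqrt(sum(e * e for e in estimated))
    norm_t = math.sqrt(sum(t * t for t in truth))
    return dot / (norm_e * norm_t)


# Toy 2 x 2 OD matrices, for illustration only.
od_estimated = flatten([[0.0, 2.0], [1.0, 0.0]])
od_truth = flatten([[0.0, 1.0], [1.0, 0.0]])

mse_value = mse(od_estimated, od_truth)
mae_value = mae(od_estimated, od_truth)
similarity = cosine_similarity(od_estimated, od_truth)
```

Scaling `od_estimated` by any positive constant changes the MSE and MAE but leaves the cosine similarity unchanged, which is the property the similarity indicator relies on.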
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
BT | Bluetooth |
FCM | Fuzzy C-Means |
MAE | Mean Absolute Error |
MSE | Mean Squared Error |
HD | Historical boarding and alighting data |
References
1. Sun, W.; Schmocker, J.D.; Fukuda, K. Estimating the route-level passenger demand profile from bus dwell times. Transp. Res. Part C Emerg. Technol. 2025, 130, 103273.
2. Demissie, M.G.; Kattan, L. Estimation of truck origin-destination flows using GPS data. Transp. Res. Part E Logist. Transp. Rev. 2025, 159, 102621.
3. Zhang, C.; Chen, X.; Zhao, J.; Jiang, Z.; Chung, E. Estimating bus passenger origin-destination flow via passenger reidentification using video images. IEEE Trans. Intell. Transp. Syst. 2025, 1–15.
4. Shafaeipour, N.; Stanciu, V.D.; van Steen, M.; Wang, M. Understanding the protection of privacy when counting subway travelers through anonymization. Comput. Environ. Urban Syst. 2025, 110, 102091.
5. Chung, M. Real-time passenger counting in streetcars using high sound frequency signals. IEEE Sens. J. 2025, 15, 965–970.
6. Demetrio, A.; Elgner, F.; Hameister, H.; Quinting, M.; Warzok, D.; Wendel, J. Large-scale Wi-Fi and Bluetooth data collection for reconstructing passenger flows. J. Locat. Based Serv. 2024, 18, 185–204.
7. Kostakos, V.; Camacho, T.; Mantero, C. Towards proximity-based passenger sensing on public transport buses. Pers. Ubiquitous Comput. 2013, 17, 1807–1816.
8. Friesen, M.R.; McLeod, R.D. Bluetooth in intelligent transportation systems: A survey. Int. J. Intell. Transp. Syst. Res. 2014, 13, 143–153.
9. Owais, M. Deep learning for integrated origin-destination estimation and traffic sensor location problems. IEEE Trans. Intell. Transp. Syst. 2024, 25, 6501–6513.
10. Pu, Z.; Cui, Z.; Zhu, M.; Wang, Y. Mining public transit ridership flow and origin-destination information from Wi-Fi and Bluetooth sensing data. Transportation Research Part C. arXiv 2019, arXiv:1911.01282.
11. Lee, I.; Cho, S.H.; Kim, K.; Kho, S.Y.; Kim, D.K. Travel pattern-based bus trip origin-destination estimation using smart card data. PLoS ONE 2022, 17, e0270346.
12. Zhao, D.; Mihaita, A.S.; Ou, Y.; Grzybowska, H.; Li, M. Origin-destination matrix estimation for public transport: A multi-modal weighted graph approach. Transp. Res. Part C Emerg. Technol. 2025, 165, 104694.
13. Jin, M.; Wang, M.; Gong, Y.; Liu, Y. Spatio-temporally constrained origin-destination inferring using public transit fare card data. Phys. A Stat. Mech. Its Appl. 2022, 603, 127642.
14. Ryu, S.; Park, B.B.; El-Tawab, S. WiFi sensing system for monitoring public transportation ridership: A case study. IEEE Trans. Intell. Transp. Syst. 2020, 24, 3092–3104.
15. Servizi, V.; Persson, D.R.; Pereira, F.C.; Villadsen, H.; Bækgaard, P.; Peled, I.; Nielsen, O.A. “Is Not the Truth the Truth?”: Analyzing the impact of user validations for bus in/out detection in smartphone-based surveys. IEEE Trans. Intell. Transp. Syst. 2023, 24, 11905–11920.
16. Ferreira, M.C.; Dias, T.G.; Cunha, J.F. Anda: An innovative micro-location mobile ticketing solution based on NFC and BLE technologies. IEEE Trans. Intell. Transp. Syst. 2022, 23, 6316–6325.
17. Minea, M.; Dumitrescu, C.; Costea, I.M.; Chiva, I.C.; Semenescu, A. Developing a solution for mobility and distribution analysis based on Bluetooth and artificial intelligence. Sensors 2020, 20, 7327.
18. Tan, Z.; Li, X.; Liu, Y. Bus passenger origin-destination flow estimation using entry-only smartcard data: A self-supervised learning method without alighting data. IEEE Trans. Intell. Transp. Syst. 2025, 26, 4808–4822.
19. Li, H.; Wang, Y.; Xu, X.; Qin, L.; Zhang, H. Short-term passenger flow prediction under passenger flow control using a dynamic radial basis function network. Appl. Soft Comput. J. 2019, 83, 105620.
20. Nguyen, K.A.; Wang, Y.; Li, G.; Luo, Z.; Watkins, C. Realtime tracking of passengers on the London Underground transport by matching smartphone accelerometer footprints. Sensors 2019, 19, 4184.
21. Fabre, L.; Bayart, C.; Bonnel, P.; Mony, N. The potential of Wi-Fi data to estimate bus passenger mobility. Technol. Forecast. Soc. Change 2025, 192, 122509.
22. González, A.B.R.; Diaz, J.J.V.; Wilby, M.R. Detailed origin-destination matrices of bus passengers using radio frequency identification. IEEE Trans. Intell. Transp. Syst. 2025, 14, 141–152.
23. Pu, Z.; Zhu, M.; Li, W.; Cui, Z.; Guo, X.; Wang, Y. Monitoring public transit ridership flow by passively sensing Wi-Fi and Bluetooth mobile devices. IEEE Internet Things J. 2020, 8, 474–486.
24. Huang, Z.; Mireles de Villafranca, A.E.; Sipetas, C.; Quach, T. Crowd-sensing commuting patterns using multi-source wireless data: A case of Helsinki commuter trains. arXiv 2023, arXiv:2302.02661.
25. Bezdek, J.C.; Ehrlich, R.; Full, W. FCM: The fuzzy c-means clustering algorithm. Comput. Geosci. 1984, 10, 191–203.
26. Sun, H.; Wang, S.; Jiang, Q. FCM-based model selection algorithms for determining the number of clusters. Pattern Recognit. 2004, 37, 2027–2037.
27. Chicco, D.; Warrens, M.J.; Jurman, G. The coefficient of determination R-squared is more informative than SMAPE, MAE, MAPE, MSE and RMSE in regression analysis evaluation. PeerJ Comput. Sci. 2021, 7, e623.
28. Ji, Y.X.; Tuo, S.J.; Misulanilabi; Micle, D. Estimation method of bus passenger trip origin and destination based on boarding and alighting passenger counts. J. Tongji Univ. 2013, 41, 1020–1024+1118.
29. Soltani Naveh, K.; Kim, J. Urban trajectory analytics: Day-of-week movement pattern mining using tensor factorization. IEEE Trans. Intell. Transp. Syst. 2019, 20, 2540–2549.
Feature Computation | |
---|---|
m | Bluetooth MAC address |
t | Timestamp |
Longitude of Bluetooth MAC address m detected at timestamp t | |
Latitude of Bluetooth MAC address m detected at timestamp t | |
Total detection count of Bluetooth MAC address m | |
Detection duration of Bluetooth MAC address m | |
Last detection timestamp t for Bluetooth MAC address m | |
First detection timestamp t for Bluetooth MAC address m | |
Average RSSI value of Bluetooth MAC address m | |
Maximum RSSI value of Bluetooth MAC address m | |
RSSI value recorded when detecting Bluetooth MAC address m at timestamp t | |
Timestamp set of vehicle GPS data | |
Bluetooth timestamp set of MAC address m | |
Matched timestamp set after aligning Bluetooth timestamps with vehicle GPS timestamps | |
Upcoming station number of the vehicle at timestamp t | |
Timestamp set corresponding to (upcoming station number−1) before first detection time | |
Timestamp set corresponding to (upcoming station number + 1) before first detection time | |
Distance from preceding station for Bluetooth MAC address m at timestamp t | |
Distance to subsequent station for Bluetooth MAC address m at timestamp t | |
Minimum distance to the nearest station during first detection of Bluetooth MAC address m | |
Minimum distance to the nearest station during last detection of Bluetooth MAC address m | |
Total travel distance between first and last detection instances of Bluetooth MAC address m | |
Average travel speed during detection period of Bluetooth MAC address m | |
Maximum instantaneous speed during detection period of Bluetooth MAC address m | |
Time interval between consecutive timestamps t + 1 and t | |
Passenger–Non-passenger Differentiation | |
Objective function | |
U | Membership matrix |
V | Cluster centroid matrix |
k-th data sample | |
Membership degree of sample k to cluster i | |
i-th cluster centroid | |
n | Number of MAC addresses |
Fuzzification exponent | |
c | Number of clusters (fixed at 2: passenger/non-passenger) |
A | Covariance matrix |
X | Input feature matrix |
Result Validation | |
i | The i-th station |
Ground-truth boarding count at station i | |
Ground-truth alighting count at station i | |
Bluetooth-detected boarding count at station i | |
Bluetooth-detected alighting count at station i | |
Ground-truth boarding proportion at station i | |
Ground-truth alighting proportion at station i | |
Bluetooth-detected boarding proportion at station i | |
Bluetooth-detected alighting proportion at station i |
Scenario | ACCa | ACCb |
---|---|---|
Morning Peak | 91.22% | 96.02% |
Evening Peak | 95.18% | 95.52% |
Scenario | OD Estimation Method | MSE | MAE | Similarity |
---|---|---|---|---|
Morning Peak | BT-based | 0.377 | 0.189 | 0.763 |
Morning Peak | HD-based | 0.506 | 0.197 | 0.727 |
Evening Peak | BT-based | 0.114 | 0.083 | 0.782 |
Evening Peak | HD-based | 0.378 | 0.121 | 0.738 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Xu, J.; Pan, Z.; Zhang, C.; Yang, X. Leveraging Bluetooth and GPS Sensors for Route-Level Passenger Origin–Destination Flow Estimation. Sensors 2025, 25, 2351. https://doi.org/10.3390/s25082351