Mobility-Aware Privacy-Preserving Mobile Crowdsourcing
Abstract
1. Introduction
- Challenge 1. The traditional Markov method can only model steady-state transitions. However, mobile users often travel with different patterns at different times in the real world. Therefore, to model a user’s dynamic mobility with a Markov chain, the first obstacle to overcome is the lack of time correlation in the traditional Markov model, in both its transfer model and its steady-state distribution [14,15];
- Challenge 2. Because the sampled data are incomplete, we must consider how to apply random sampling methods flexibly so as to establish an unbiased spatiotemporal Markov model;
- Challenge 3. Performance evaluation raises its own questions, such as how many POIs should be generated in each time partition and how closely the generated POIs relate to the user. These need to be evaluated with appropriate indicators.
- We survey the existing location privacy-preserving techniques, analyze their technical vulnerabilities, and clarify our research problem;
- We introduce a time-partitioning concept into the traditional Markov model, forming a new spatiotemporal Markov model named TMarkov. TMarkov can model a mobile user’s time-varying behavioral patterns;
- We obtain an unbiased estimate of the TMarkov model using the Gibbs sampling method;
- We carefully select suitable technical indicators and conduct extensive experiments on a real-world dataset to evaluate the performance of TMarkov.
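The time-partitioning idea behind TMarkov can be illustrated with a minimal sketch. It is not the authors' implementation: it estimates one transition table per time partition by simple frequency counting rather than by Gibbs sampling, and the names `fit_time_partitioned_markov`, `partition_of`, and the toy trajectory are assumptions made purely for illustration.

```python
from collections import defaultdict


def fit_time_partitioned_markov(trajectory, partition_of):
    """Estimate one transition table per time partition.

    trajectory   : list of (hour, location) visits in time order.
    partition_of : function mapping an hour to a time-partition id.
    Returns {partition: {src: {dst: probability}}}; each transition is
    attributed to the partition in which it starts.
    """
    counts = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))
    for (hour, src), (_, dst) in zip(trajectory, trajectory[1:]):
        counts[partition_of(hour)][src][dst] += 1

    model = {}
    for part, rows in counts.items():
        model[part] = {}
        for src, row in rows.items():
            total = sum(row.values())
            # Normalise counts so every row sums to 1.
            model[part][src] = {dst: n / total for dst, n in row.items()}
    return model


def partition_of(hour):
    # Hypothetical two-way split of the day, for illustration only.
    return "morning" if hour < 12 else "afternoon"


# Toy trajectory (hypothetical data).
trajectory = [(8, "home"), (9, "office"), (10, "office"),
              (13, "cafe"), (14, "office"), (17, "home")]
model = fit_time_partitioned_markov(trajectory, partition_of)
```

In this toy example the same source location, `"office"`, has different outgoing distributions in the two partitions, which is the time dependence a single steady-state transition matrix cannot express.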
2. Problem Formalization
2.1. General Privacy-Preserving Mobile Crowdsourcing
2.2. Attack Models
- Inferential attacks based on crowdsourcing elements. Given the location of a task accepted by the user and the predesigned maximum acceptable distance (MAD), the platform can draw a circular area with the accepted task as the center and the MAD as the radius. This circle is the effective area in which the user must be located when doing the crowdsourcing task, as shown in Figure 2a. The upper subfigure shows the scenario in which the user accepts three tasks simultaneously: the user must be in the intersection of the three effective areas. The lower subfigure shows the area inference when the user accepts task B but rejects tasks A and C: the user must be in the area close to B but away from A and C;
- Semantic analysis of mobile trajectories. A traveling trajectory carries the semantics of the user’s daily mobility. The correlations between locations and the mobile user may reveal the user’s sensitive private information. In Figure 2b, we take as an example the trajectory a user travels on a normal working day. The user leaves position 1 in the morning, stays at position 2 through the morning and afternoon, and returns to position 1 in the evening. It is easy to infer that location 1 is the user’s home and location 2 is the user’s workplace;
- Inferential attacks based on continually shared locations. The platform has accumulated a large number of historical trajectories. It can analyze the spatiotemporal correlations hidden in the mobile data, model the user’s mobility, and then infer the user’s subsequent behavior. As shown in Figure 2c, after the user has completed a series of tasks, the platform may be able to infer, through a spatiotemporal-correlation inferential attack, that location 1 is the place the user is most likely to visit next;
- Inferential attacks based on road constraints and other background knowledge. In practical applications, the platform may re-identify user-generated locations using real-world road-network constraints. As shown in Figure 2d, only position 1 is actually reachable within the effective area corresponding to task D. The platform may exploit other background knowledge, such as the user’s social relationships, in the same way.
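The first attack above reduces to a simple geometric test, sketched here under stated assumptions: coordinates are planar and measured in kilometres, a rejected task is taken to mean that the user is outside that task's MAD circle, and the function names and task positions are hypothetical.

```python
import math


def in_effective_area(point, task, mad_km):
    """True if `point` lies within the MAD circle centred on `task`."""
    return math.dist(point, task) <= mad_km


def consistent_with_responses(point, accepted, rejected, mad_km):
    """True if `point` is a possible user location given the responses:
    inside every accepted task's circle and outside every rejected one."""
    inside_all_accepted = all(in_effective_area(point, t, mad_km) for t in accepted)
    inside_any_rejected = any(in_effective_area(point, t, mad_km) for t in rejected)
    return inside_all_accepted and not inside_any_rejected


# Hypothetical task positions on a line, 6 km apart, with MAD = 5 km.
task_a, task_b, task_c = (0.0, 0.0), (6.0, 0.0), (12.0, 0.0)
MAD_KM = 5.0

# The user accepts B and rejects A and C.
near_b = consistent_with_responses((6.0, 0.0), [task_b], [task_a, task_c], MAD_KM)
near_a = consistent_with_responses((3.0, 0.0), [task_b], [task_a, task_c], MAD_KM)
```

Here `near_b` is `True` and `near_a` is `False`: the point at (3, 0) is within reach of B but also within reach of the rejected task A, so the platform would exclude it.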
2.3. Our Design Goals
3. System Model
3.1. A Glimpse of TMarkov’s Application Scenario
3.2. System Design and Its Rationality
4. Mobility-Aware Trajectory Prediction
4.1. Time-Related Mobility Perception
4.2. The Transfer Model’s Unbiased Estimation
Algorithm 1: Two-dimensional Gibbs sampling |
1 Initialization |
2 At t = 0,1,..., loop sampling |
2.1 |
2.2 |
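For readers unfamiliar with the technique, the following is a textbook sketch of two-dimensional Gibbs sampling over a small discrete joint distribution. It alternates draws from the two conditionals after a random initialization. The toy weights in `joint`, the burn-in length, and the function name `gibbs_2d` are assumptions, and the sketch does not reproduce the specific sampling steps of Algorithm 1.

```python
import random


def gibbs_2d(joint, n_samples, burn_in=500, seed=0):
    """Draw (x, y) pairs from a discrete joint distribution.

    joint[x][y] is a strictly positive (possibly unnormalised) weight.
    """
    rng = random.Random(seed)
    n_x, n_y = len(joint), len(joint[0])

    # Initialisation: start from an arbitrary state.
    x, y = rng.randrange(n_x), rng.randrange(n_y)

    samples = []
    for step in range(burn_in + n_samples):
        # Draw x from p(x | y): column y of the joint, renormalised by choices().
        x = rng.choices(range(n_x), weights=[joint[i][y] for i in range(n_x)])[0]
        # Draw y from p(y | x): row x of the joint.
        y = rng.choices(range(n_y), weights=joint[x])[0]
        if step >= burn_in:
            samples.append((x, y))
    return samples


# Toy joint distribution (hypothetical); entries sum to 1.
joint = [[0.1, 0.2],
         [0.3, 0.4]]
samples = gibbs_2d(joint, 20000)
freq_11 = samples.count((1, 1)) / len(samples)
```

With enough samples, the empirical frequency `freq_11` should approach the joint probability 0.4 of the pair (1, 1).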
4.3. Future Behavioral Trajectory Prediction
4.4. Complexity Analysis
5. Experimental Evaluation
5.1. Real-World Dataset
5.2. Experimental Setting
5.3. Predicted Trajectory Exhibitions
5.4. Average Coverage Ratio
5.5. Cumulative Distribution of Time-Partition’s Proportion on the Coverage Ratio
6. Discussion
6.1. Application Modes of Our TMarkov
- The first scenario deploys our TMarkov solution in an online application. The platform already has plenty of the user’s historical mobile data. It can model the spatiotemporal correlations hidden in the user’s mobility and launch inferential attacks to infer the user’s actual travel. Our design goal is to resist the platform’s inferential attacks while minimizing changes to the existing architecture and facilities. TMarkov models the spatiotemporal correlations in the user’s personal historical data to simulate the platform’s attack capacity. An anonymous set is then constructed from the locations the user is most likely to visit within a certain area. In this way, TMarkov breaks the spatiotemporal correlation that the platform’s attacks rely on, realizing a privacy-preserving mobile application;
- The second scenario corresponds to the cold-start problem of a new user in an online application. The platform does not have any of the user’s personal historical data, so it has no capacity to infer and attack the new user’s personal behavior patterns. To prevent the platform from obtaining the new user’s traveling information, we can replace personal mobile data (personal datasets) with public users’ traveling data (public datasets). With TMarkov trained on public datasets, a location’s popularity can be characterized by public users’ visits. Before participating in the application, the user pre-specifies a small area (or area sequence) based on the actual travel. TMarkov builds an anonymous set consisting of the most popular locations within the specified area, and the platform recommends crowdsourcing tasks to the new user. Here, the attacker can only infer the user’s travel from the crowdsourcing tasks the user accepts. A circle can be drawn as the effective area the user must visit while doing the task, taking the accepted task’s position as the center and the user’s maximum traveling distance for accepting the task as the radius. The user usually takes 5 km as the maximum acceptable distance, so the effective area of the platform’s attack is a circle with a diameter of 10 km (about 78.5 km²). This area is large enough that we consider the user’s location privacy safe in this case;
- The third scenario is a newly launched platform without any data. Here, the platform has the weakest attack capability. The users only need to adopt a spatial privacy-preserving method to protect their actual locations, such as generalization (generalizing the location into a small area), obfuscation (building an anonymous set with surrounding POIs), or perturbation (adding random noise to the actual position based on the differential privacy (DP) principle).
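The cold-start scenario can be sketched as follows, assuming that location popularity is available as visit counts from a public dataset and that the user-specified area is an axis-aligned rectangle in kilometre coordinates. The function names, the rectangle representation, and the toy counts are hypothetical. The second helper reproduces the effective-area arithmetic for a 5 km MAD.

```python
import math


def build_anonymous_set(popularity, area, k):
    """Return the k most visited locations inside the user-specified area.

    popularity : {(x_km, y_km): visit_count}, e.g. derived from public data.
    area       : (x_min, y_min, x_max, y_max) rectangle chosen by the user.
    """
    x_min, y_min, x_max, y_max = area
    inside = [(loc, n) for loc, n in popularity.items()
              if x_min <= loc[0] <= x_max and y_min <= loc[1] <= y_max]
    # Most popular first; locations outside the area are never candidates.
    inside.sort(key=lambda item: item[1], reverse=True)
    return [loc for loc, _ in inside[:k]]


def effective_area_km2(mad_km):
    """Area of the circle an attacker can infer from one accepted task."""
    return math.pi * mad_km ** 2


# Toy popularity counts (hypothetical data).
popularity = {(1, 1): 40, (2, 1): 75, (1, 2): 10, (9, 9): 300}
anonymous_set = build_anonymous_set(popularity, area=(0, 0, 3, 3), k=2)
attack_area = effective_area_km2(5.0)  # circle of radius 5 km
```

The very popular location at (9, 9) is excluded because it falls outside the specified area, so the anonymous set contains only plausible places near the user's actual travel.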
6.2. Future Work
- The space complexity of our TMarkov is relatively high. In the experiments, we divided the area within Beijing’s 3rd Ring Road into 20 × 20 equal-size blocks, and the target time period from 8 a.m. to 6 p.m. into 10 time partitions. This yields 400 blocks × 10 time partitions = 4000 states, so our TMarkov has a transfer matrix with dimensions as high as 4000 × 4000. Observing the trained TMarkov transfer matrix, we find that it is very sparse: most of the transition probabilities are 0. This is because the number of locations a typical user visits in daily life is very limited. Based on this observation, we can restrict the modeling space to the points of interest that the user actually visited and stayed at. This optimization can significantly reduce the space complexity of TMarkov;
- Our TMarkov is broadly applicable, but its functionality is relatively narrow. TMarkov generates the user’s time-related probability distribution over the location set and models the user’s time-varying mobile behavior patterns, so it can be widely applied in mobility-modeling scenarios. Taking it as a core component for mobility modeling, we can develop new privacy-preserving mechanisms for more complex security issues. For example, in a continual crowdsourcing scenario, we can update the user’s time-related probability distribution between two participations using Bayes’ theorem, conditioning on the accepted task, and then run TMarkov again when the user next participates. Such an improved solution can eliminate the privacy risk that accepted tasks pose to the user’s subsequent participation. We can also supply a differential privacy (DP) mechanism with the anonymous set constructed from the user’s time-related probability distribution generated by TMarkov, to achieve a spatiotemporally correlated DP solution.
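The space-complexity point can be made concrete with a short sketch. The state count follows from the figures quoted above. The flattening of (time partition, row, column) into a single state index and the dictionary-based sparse store are assumptions for illustration, not the layout used in the paper.

```python
GRID_ROWS, GRID_COLS, TIME_PARTITIONS = 20, 20, 10

n_states = GRID_ROWS * GRID_COLS * TIME_PARTITIONS   # 4000 states
dense_entries = n_states ** 2                        # entries in a dense matrix


def state_index(row, col, partition):
    """Flatten (partition, row, col) into one state id (assumed layout)."""
    return (partition * GRID_ROWS + row) * GRID_COLS + col


class SparseTransitions:
    """Store only non-zero transition probabilities."""

    def __init__(self):
        self.rows = {}  # {src_state: {dst_state: probability}}

    def set(self, src, dst, prob):
        if prob > 0.0:
            self.rows.setdefault(src, {})[dst] = prob

    def get(self, src, dst):
        # Missing entries are implicit zeros.
        return self.rows.get(src, {}).get(dst, 0.0)

    def stored_entries(self):
        return sum(len(row) for row in self.rows.values())


# Hypothetical example: one block with two possible successors.
matrix = SparseTransitions()
home = state_index(3, 4, 0)
matrix.set(home, state_index(3, 5, 1), 0.7)
matrix.set(home, state_index(10, 10, 1), 0.3)
```

A dense 4000 × 4000 matrix holds 16,000,000 entries, whereas the sparse store grows only with the number of transitions the user actually makes.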
7. Related Work
8. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Chen, Y.; Lv, P.; Guo, D.; Zhou, T.; Xu, M. Task and participant matching in mobile crowd sensing: A Survey. Springer J. Comput. Sci. Technol. 2018, 33, 768–791.
- Chen, Y.; Guo, D.; Lv, P.; Zhou, T.; Xu, M. ProSC: Profit-driven participant selection in compressive mobile crowdsensing. In Proceedings of the 26th International Symposium on Quality of Service (IWQoS), Banff, AB, Canada, 4–6 June 2018.
- Chen, Y.; Lv, P.; Guo, D.; Zhou, T.; Xu, M. Trajectory segment selection with limited budget in mobile crowd sensing. Elsevier J. Pervasive Mob. Comput. 2017, 40, 123–138.
- Gramaglia, M.; Fiore, M.; Tarable, A. Preserving mobile subscriber privacy in open datasets of spatiotemporal trajectories. In Proceedings of the IEEE INFOCOM 2017 - IEEE Conference on Computer Communications, Atlanta, GA, USA, 1–4 May 2017.
- Oya, S.; Troncoso, C.; Perezgonzalez, F. Back to the drawing board: Revisiting the design of optimal location privacy-preserving mechanisms. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, Dallas, TX, USA, 30 October–3 November 2017.
- Götz, M.; Nath, S.; Gehrke, J. MaskIt: Privately releasing user context streams for personalized mobile applications. In Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data, Scottsdale, AZ, USA, 20–24 May 2012.
- Papadopoulos, S.; Bakiras, S.; Papadias, D. Nearest neighbor search with strong location privacy. Springer VLDB Endow. 2010, 3, 619–629.
- Andres, M.E.; Bordenabe, N.; Chatzikokolakis, K.; Palamidessi, C. Geo-indistinguishability: Differential privacy for location-based systems. In Proceedings of the 2013 ACM SIGSAC Conference on Computer & Communications Security, Berlin, Germany, 4–8 November 2013.
- Gedik, B.; Liu, L. Protecting location privacy with personalized k-anonymity: Architecture and algorithms. IEEE Trans. Mob. Comput. 2008, 7, 1–18.
- Xiao, Y.; Xiong, L. Protecting locations with differential privacy under temporal correlations. arXiv 2014, arXiv:1410.5919.
- Jin, X.; Zhang, R.; Chen, Y. DPSense: Differentially private crowdsourced spectrum sensing. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, Vienna, Austria, 24–28 October 2016.
- Shokri, R.; Theodorakopoulos, G.; Boudec, J.L.; Hubaux, J. Quantifying location privacy. In Proceedings of the 32nd IEEE Symposium on Security and Privacy, S&P 2011, Berkeley, CA, USA, 22–25 May 2011.
- Shokri, R.; Theodorakopoulos, G.; Troncoso, C. Protecting location privacy: Optimal strategy against localization attacks. In Proceedings of the 2012 ACM Conference on Computer and Communications Security, Raleigh, NC, USA, 16–18 October 2012.
- Lv, Q.; Mei, Z.; Qiao, Y.; Zhong, Y.; Lei, Z. Hidden markov model based user mobility analysis in LTE network. In Proceedings of the 2014 International Symposium on Wireless Personal Multimedia Communications (WPMC), Sydney, NSW, Australia, 7–10 September 2014.
- Li, X.; Lian, D.; Xie, X.; Sun, G. Lifting the predictability of human mobility on activity trajectories. In Proceedings of the 2015 IEEE International Conference on Data Mining Workshop (ICDMW), Atlantic City, NJ, USA, 14–17 November 2015.
- Bhakti, M.; Shelar, D.; Chitre, D.K. Hidden markov model with biclustering cache replacement policy for location based services in MANET. Int. J. Eng. Comput. Sci. 2015, 4, 12000–12004.
- Guo, B.; Liu, Y.; Wang, L.; Li, V.O.; Lam, J.C.; Yu, Z. Task allocation in spatial crowdsourcing: Current state and future directions. IEEE Internet Things J. 2018, 5, 1749–1764.
- Shokri, R.; Theodorakopoulos, G. Location Privacy Meter Tool. Location Privacy. 2011. Available online: https://github.com/rzshokri/quantifying (accessed on 12 May 2020).
- Robert, C.; Celeux, G.; Diebolt, J. Bayesian estimation of hidden Markov chains: A stochastic implementation. IEEE Stat. Probab. Lett. 1993, 16, 77–83.
- Zheng, Y.; Zhang, L.; Xie, X.; Ma, W. Mining interesting locations and travel sequences from GPS trajectories. In Proceedings of the ACM WWW, Madrid, Spain, 20–24 April 2009.
- Zheng, Y.; Li, Q.; Chen, Y.; Xie, X.; Ma, W. Understanding mobility based on GPS Data. In Proceedings of the ACM Ubicomp, Seoul, Korea, 21–24 September 2008.
- Zheng, Y.; Xie, X.; Ma, W. GeoLife: A collaborative social networking service among User, location and trajectory. IEEE Data Eng. Bull. 2010, 33, 32–40.
- Ouyang, K.; Shokri, R.; Rosenblum, D. A non-parametric generative model for human trajectories. In Proceedings of the IJCAI, Stockholm, Sweden, 13–19 July 2018.
- Jiang, J.; Pan, C.; Liu, H.; Yang, G. Predicting Human Mobility Based on Location Data Modeled by Markov Chains. In Proceedings of the IEEE UPINLBS, Shanghai, China, 3–4 November 2016.
- Ma, Q.; Zhang, S.; Zhu, T. PLP: Protecting location privacy against correlation analyze attack in crowdsensing. IEEE Trans. Mob. Comput. 2016, 16, 2588–2598.
- Ghinita, G. Privacy for Location-Based Services. Synth. Lect. Inf. Secur. Privacy Trust. 2013, 4, 1–85.
- Nergiz, M.; Atzori, M.; Saygin, Y.; Guc, B. Towards Trajectory Anonymization: A Generalization-Based Approach. Trans. Data Priv. 2009, 2, 52–61.
- Chen, R.; Fung, B.; Desai, B.C.; Sossou, N. Differentially private transit data publication: A case study on the Montreal transportation system. In Proceedings of the ACM KDD, Beijing, China, 12–16 August 2012.
- Krumm, J. A survey of computational location privacy. Pers. Ubiquitous Comput. 2009, 13, 391–399.
- Poulis, G.; Skiadopoulos, S.; Loukides, G.; Gkoulalas-Divanis, A. Apriori-based algorithms for km-anonymizing trajectory data. Trans. Data Priv. 2014, 7, 165–194.
- Shokri, R.; Stronati, M.; Song, C. Membership inference attacks against machine learning models. In Proceedings of the IEEE SP, San Jose, CA, USA, 22–24 May 2017.
- Lee, B.; Oh, J.; Yu, H.; Kim, J. Protecting location privacy using location semantics. In Proceedings of the ACM SIGKDD, San Diego, CA, USA, 21–24 August 2011.
Notation | Description |
---|---|
 | Markov steady distribution |
M | Markov transfer matrix |
 | Probability distribution of state i |
 | Transition probability from state i to state j |
T | Time partitions set |
t | Timestamps |
S | Location set |
 | Locations |
Indicators | Mean | Standard Deviation |
---|---|---|
Trajectory Length (GPS locations) | 2134 | 380.850 |
Trajectory Time-span (hours) | 4.6 | 1.088 |
The Number of Trajectories per User | 157.8 | 17.210 |
The Time-span of Trajectories per User (months) | 6.2 | 1.166 |
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Qiu, G.; Shen, Y.; Cheng, K.; Liu, L.; Zeng, S. Mobility-Aware Privacy-Preserving Mobile Crowdsourcing. Sensors 2021, 21, 2474. https://doi.org/10.3390/s21072474