*Article* **Perturbed-Location Mechanism for Increased User-Location Privacy in Proximity Detection and Digital Contact-Tracing Applications**

**Elena Simona Lohan 1, Viktoriia Shubina 1,2,\* and Dragoș Niculescu 2**

	- 060042 Bucharest, Romania; dragos.niculescu@upb.ro

**Abstract:** Future social networks will rely heavily on sensing data collected from users' mobile and wearable devices. A crucial component of such sensing will be full or partial access to users' location data, in order to enable various location-based and proximity-detection-based services. A timely example of such applications is digital contact tracing in the context of infectious-disease control and management. Other proximity-detection-based applications include social networking, finding nearby friends, optimized shopping, or quickly finding a point of interest in a commuting hall. Location information can enable a myriad of new services, among which are proximity-detection services. Efficiently addressing location-privacy threats remains a major challenge in proximity-detection architectures. In this paper, we propose a location-perturbation mechanism for multi-floor buildings which strongly protects the user location while preserving very good proximity-detection capabilities. The proposed mechanism relies on the assumption that users have full control of their location information and are able to get some floor-map information from a remote service provider when entering a building of interest. In addition, we assume that the devices are able to adjust the level of accuracy at which the users disclose their location to the service provider. Detailed simulation-based results are provided, based on multi-floor building scenarios with hotspot regions, and the tradeoff between privacy and utility is thoroughly investigated.

**Keywords:** location privacy; perturbation mechanism; proximity detection; digital contact tracing; multi-floor areas

### **1. Introduction and Problem Statement**

People are increasingly interconnected through their wireless devices, such as smartphones, smartwatches, and other wearable devices. Most such devices are already capable of localization and sensing, either through Global Navigation Satellite Systems (GNSS) chipsets in outdoor scenarios or through IEEE802.11\* (e.g., WiFi), Ultra-Wide Band (UWB), or Bluetooth Low Energy (BLE) chipsets in indoor scenarios. Many future wireless standards will also make localization and sensing a part of the system design, such as the emerging sixth generation of cellular communications (6G) [1], the upcoming IEEE802.11bf WiFi standard [2], and the UWB chipsets incorporated in modern smartphones [3].

Proximity-detection services based on wireless signals, and in particular based on BLE, have gained significant interest in the past two years, as they enable digital contact-tracing techniques [4] shown to be relevant in the context of COVID-19 disease management [5,6]. Magnetic-field proximity-detection solutions have also been recently proposed in the context of digital contact tracing, for example, in [7].

Digital contact tracing is an approach that has been built according to the privacy-by-design concept to augment the manual ways of tracing the spread of COVID-19. By design, mobile and wireless gadgets equipped with BLE chipsets can transmit and receive anonymized signals with timestamps from nearby devices. This concept has proved useful for digital contact-tracing purposes in the past year, since BLE is a short-range technology that is particularly suitable for estimating close-range distances (e.g., less than 2 m) between mobile phone users whose paths crossed. The BLE data with temporary identifiers, Received Signal Strength (RSS) values, and the timestamps of the encountered phones are therefore converted into the distance and time spent in proximity. Furthermore, there is a taxonomy [6,8] of centralized and decentralized decision-making approaches to handle data processing and inform the users about the risk of being exposed to the virus.

**Citation:** Lohan, E.S.; Shubina, V.; Niculescu, D. Perturbed-Location Mechanism for Increased User-Location Privacy in Proximity Detection and Digital Contact-Tracing Applications. *Sensors* **2022**, *22*, 687. https://doi.org/10.3390/s22020687

Academic Editors: Suparna De and Klaus Moessner

Received: 11 December 2021 Accepted: 14 January 2022 Published: 17 January 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

In the centralized approach [6,9], the logs from the mobile phone (or wearable bracelet) are encrypted and transferred to the cloud with a certain periodicity (e.g., once a day). Thus, when users opt in to the protocol, the centralized server estimates the risk of being exposed and conveys this risk to the users. The majority of centralized approaches follow the data-minimization principle and request the upload of only relevant data, such as the temporary or ephemeral identities of the users who stayed within a certain proximity for a time exceeding the set threshold. As a result, all computations for the risk scoring are made on the server side, and the users only receive the notifications.

A different approach, known as decentralized or federated, delegates the risk scoring to the users' own mobile or edge devices, where the logs are stored locally. Google and Apple adopted this framework in their jointly designed Exposure Notifications protocol described in [10]. Here, only infected users, once confirmed as having tested positive, upload their data to the cloud, whereas the other users' devices download the data from the server and perform the risk estimation locally. The latter approach assumes that all data shared with the centralized server is subject to the user's consent.

According to the end-user surveys reported in [6], users are more likely to perceive the decentralized decision-making approach as a better fit for preserving their location privacy, because the data is stored locally (typically for up to 21 days, unlike server-side storage, which can last much longer). However, there is no significant threat to the users' sensitive information in the centralized approach when the logs are encrypted and securely saved on a trusted server. The above-mentioned digital contact-tracing example demonstrates that location-privacy concerns arise in the context of sensitive information, such as one's whereabouts and the identities of encountered contacts.

Location Privacy-Preserving Mechanisms (LPPM) intend to preserve the individual location privacy in scenarios where services request access to the users' spatial location [11]. Location-Based Services (LBS) that collect sensitive information of the users' locations, as described in the classification framework in [12], can benefit from implementing LPPM.

Other examples of proximity-based services are 'find-a-friend' applications [13] or other social-networking applications [14].

In all these proximity-based services, the utility of the service comes from a good detection probability (i.e., the probability of correctly detecting two users in the vicinity of each other when they are neighbours, also known as the sensitivity measure) as well as a low false-alarm probability (i.e., the probability of incorrectly detecting two users in the vicinity of each other when in fact they are far apart). This utility is inherently in a tradeoff with the amount of location privacy that users can have when disclosing their location.
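The two utility measures can be illustrated with a minimal sketch. It assumes, for illustration only, that two users count as neighbours when their true distance is at most a threshold `gamma`, and that the service declares proximity when the distance between their reported locations is at most the same threshold. The function name `proximity_rates` and the toy coordinates are hypothetical and do not come from the paper.

```python
import math

def proximity_rates(true_pos, reported_pos, gamma):
    """Return (detection probability, false-alarm probability) over all user pairs.

    Assumption: a pair is 'truly close' if the true distance is <= gamma, and is
    'declared close' if the distance between reported locations is <= gamma.
    """
    hits = neighbours = false_alarms = far_pairs = 0
    n = len(true_pos)
    for i in range(n):
        for j in range(i + 1, n):
            truly_close = math.dist(true_pos[i], true_pos[j]) <= gamma
            declared_close = math.dist(reported_pos[i], reported_pos[j]) <= gamma
            if truly_close:
                neighbours += 1
                hits += declared_close
            else:
                far_pairs += 1
                false_alarms += declared_close
    p_d = hits / neighbours if neighbours else float("nan")
    p_fa = false_alarms / far_pairs if far_pairs else float("nan")
    return p_d, p_fa

# Toy example with three users on a line (coordinates in metres).
true_pos = [(0, 0, 0), (1, 0, 0), (10, 0, 0)]
reported_pos = [(0, 0, 0), (1.5, 0, 0), (3, 0, 0)]
print(proximity_rates(true_pos, reported_pos, gamma=2.0))  # (1.0, 0.5)
```

In this toy case the only true neighbour pair is still detected, while one of the two distant pairs is wrongly reported as close, which shows how perturbing a reported location can raise the false-alarm probability.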

In order to protect users' location privacy, many approaches have been proposed so far in the literature. For example, a comprehensive survey of location-privacy mechanisms has been recently provided in [15]. The authors in [15] divided the location-privacy mechanisms into three classes: the Geo-indistinguishability (GeoInd) class, the Local Differential Privacy (LDP) class, and private spatial-decomposition class. They also pointed out that the LDP mechanism is not directly applicable to location data, while the private spatial decomposition requires the presence of a trusted server.

Once LPPM have been implemented, it is necessary to evaluate their behavior and compare it with the initial state of the system. GeoInd refers to a privacy notion that protects the user's precise location while revealing an approximate geospatial area [16].

Furthermore, when a user discloses their location through a perturbation mechanism, this mechanism can yield GeoInd [17] if the traces of the user are disclosed with a certain radius and certain statistical distributions, such as when Laplacian or Gaussian random perturbations are applied to modify the true user location. The reported location then does not reveal enough information for an adversary to distinguish the ground-truth location among neighboring devices [18].

The authors in [17] presented GeoInd as a possible notion to quantify privacy. They introduced the radius *r*, which corresponds to the level of privacy, and showed that this radius is proportional to the location radius, i.e., the Euclidean distance between the true and perturbed locations. Consequently, the radius increases when controlled randomized (e.g., Laplacian) noise is added. The authors encountered problems of discretization and truncation. In our paper, we directly use the Euclidean distance between the true and perturbed locations as a measure of user location privacy, and we study its tradeoff with the service utility.
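As a rough illustration of such random perturbations, the sketch below adds independent zero-mean Gaussian or Laplacian noise to each coordinate of a 3D location and reports the resulting Euclidean displacement. Per-coordinate independent noise is an assumption made here for simplicity; it is not the planar Laplace mechanism of [17], and the function name `perturb` and the parameter `scale` are hypothetical.

```python
import math
import random

def perturb(location, scale, noise="gaussian", rng=random):
    """Add independent zero-mean noise to each coordinate of a 3D location.

    Assumption: 'scale' is the standard deviation for Gaussian noise and the
    scale parameter b for Laplacian noise; it must be positive.
    """
    if noise not in ("gaussian", "laplacian"):
        raise ValueError("noise must be 'gaussian' or 'laplacian'")

    def sample():
        if noise == "gaussian":
            return rng.gauss(0.0, scale)
        # The difference of two i.i.d. exponential variables with mean 'scale'
        # follows a Laplace distribution with location 0 and scale 'scale'.
        return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

    return tuple(c + sample() for c in location)

rng = random.Random(42)          # fixed seed so the run is repeatable
x_true = (12.0, 7.5, 3.0)        # hypothetical true location in metres
y_pert = perturb(x_true, scale=1.0, noise="laplacian", rng=rng)
print("perturbed location:", y_pert)
print("Euclidean displacement:", math.dist(x_true, y_pert))
```

A larger `scale` gives, on average, a larger distance between the true and the reported location, which is the quantity used in this paper as the measure of location privacy.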

Another location privacy-preserving approach in the literature, which adheres to Differential Privacy (DP), is the concept of Private Spatial Decomposition presented in [19]. Private Spatial Decomposition refers to a gradient privacy-budget allocation scheme. The approach assumes a two-dimensional space and different privacy levels, and it is proved to achieve ε-differential privacy.

An additional aspect related to location privacy is the choice of the privacy metric, which is still not unified in the current literature. Such a privacy metric serves to quantify the efficiency of a localization algorithm by exploring the privacy versus accuracy [20] or the privacy versus utility [21] tradeoffs. As mentioned above, in this paper we measure the location privacy via the Root Mean Square Error (RMSE) between the perturbed location and the true user location.
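A minimal sketch of such an RMSE-based privacy measure follows. It assumes that the error of each user is the Euclidean distance between the true and the perturbed 3D location and that the mean is taken over users; the exact averaging used in the paper is not stated in this section, and the function name `location_rmse` is hypothetical.

```python
import math

def location_rmse(true_pos, perturbed_pos):
    """Root mean square of the Euclidean errors between paired 3D locations.

    Assumption: the mean is taken over users (one true/perturbed pair each).
    """
    if len(true_pos) != len(perturbed_pos) or not true_pos:
        raise ValueError("need two non-empty lists of equal length")
    squared = [math.dist(x, y) ** 2 for x, y in zip(true_pos, perturbed_pos)]
    return math.sqrt(sum(squared) / len(squared))

# Two users: the first is displaced by 5 m, the second is not displaced.
true_pos = [(0.0, 0.0, 0.0), (2.0, 2.0, 0.0)]
perturbed_pos = [(3.0, 4.0, 0.0), (2.0, 2.0, 0.0)]
print(location_rmse(true_pos, perturbed_pos))  # sqrt((25 + 0) / 2)
```

Under this definition, a higher RMSE means that the disclosed locations are further from the true ones, i.e., more location privacy.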

The authors in [22] proposed a location-aware perturbation scheme for mobile environments, where the goal was to decrease the adversary's knowledge with added Laplacian noise. Using the Hilbert curve, each second location is projected on a map, thus reducing the overhead caused by the precision of the location estimates. To evaluate the performance and accuracy of the proposed algorithm, the authors in [22] used nearness, resemblance, and displacement metrics. As a common rule, lower values of ε correspond to stronger privacy and, effectively, lower accuracy. For example, in [22], when the ε value reached 1.0, the share of points located within 1000 m of the actual positions was as high as 99.04 percent.

Although obfuscation mechanisms are growing in popularity, they introduce errors into the localization system by altering the ground-truth locations of the devices. Obfuscation mechanisms thus result in losing some of the performance, or in other words, the utility of the system. In [18], the authors designed a location-obfuscation mechanism that satisfies GeoInd. The work in [18] focused on achieving GeoInd for any pair of neighboring locations and showed good results for privacy and utility in 2D spaces. Our work focuses on 3D spaces with multi-floor buildings.

To the best of our knowledge, the optimal tradeoff between obfuscating or perturbing the user location (i.e., decreasing the granularity of the reported location) and the utility for proximity-detection applications is still not well explored in the current literature, especially when the proximity-detection application is a digital contact-tracing solution. Moreover, multidimensional approaches, such as 3D scenarios, give users more freedom to protect their location from an adversary and have received little attention so far.

This paper proposes a new perturbation metric suitable for proximity-detection-based services and applications that rely strictly on the relative distance between two users and do not need absolute location information. It offers a theoretical analysis of the properties of this metric and demonstrates, via extensive simulation-based results, a very good tradeoff between privacy preservation and service utility. The proposed metric is based on a combination of a mapping based on the argmax operator and Gaussian or Laplacian perturbations. For comparative purposes, the argmax-based metric is also compared with another metric, based on an argmin operator and Gaussian or Laplacian perturbations, and we show that it has a much better utility-privacy tradeoff than the argmin-based metric. Note that the proposed argmax-based metric is only useful in the context of proximity-based services, when only the relative distance between users is needed, but not their absolute location. By contrast, the argmin-based metric would preserve its utility also for other location-based services (in addition to the proximity-based ones), at the expense of lower privacy protection compared to the argmax-based metric.

The remainder of the paper is organized as follows: Section 2 overviews various mechanisms for preserving location privacy in the literature and offers a classification of these mechanisms. Section 3 introduces the two proposed perturbation mechanisms: one based on the argmax operator, suitable only for proximity-based services, and another one based on the argmin operator, suitable for all kinds of location-based services, but with lower privacy-preservation levels than the one based on the argmax operator. Section 4 offers a mathematical analysis of the proposed argmax operator and proves that it is able to offer GeoInd between users. Section 5 presents detailed simulation results in a 4-floor building with users located both within certain hotspot areas and outside hotspot areas. The presented results are easily scalable to any number of floors. Various configurations, in terms of building size, hotspot density, etc., are analyzed, and detailed results are presented in terms of user privacy and service utility. Finally, Section 6 summarizes the main findings and presents the conclusions.

### **2. Classification of Location-Privacy Mechanisms**

A classification of location-privacy mechanisms from the current literature is provided in Figure 1. Location privacy can be ensured at the server side, at the user side, or at both sides. A more elaborate explanation of each technique can be found in Table 1, which also builds on the literature review provided in Section 1.

User-side location-privacy mechanisms can be found, for example, in [23]. Privacy-preserving mapping solutions stem from optimal mappings designed to preserve privacy against statistical inference [24,25]. Noise-perturbation mechanisms based on various noise types, such as Laplace and Gaussian noise, are discussed, for example, in [26,27]. Dummy-location generation has been applied, for example, in [28].

Server-side location-privacy mechanisms relying on spatial cloaking and k-anonymity are described, for example, in [29–32]. The assumption in [32] is that the users communicate their location to the server with high accuracy. By contrast, in our paper we assume that the users have full control over their location and choose to disclose it to the server with moderate-to-low accuracy, according to the chosen perturbation mechanisms, as explained later in Section 3.

Private spatial decomposition solutions are discussed, for example, in [19]. Mix-zone solutions are addressed, for example, in [33,34]. Secure transformations are conceptually close to the privacy-preserving mappings done at the user/client side, and they are addressed, for example, in [35]. Server-side solutions require trust in the service provider and are susceptible to attacks on the server databases.

A privacy-preserving method that can be applied at both the server and the user side is the encryption of location data via various encryption mechanisms [36–38]. Even if encryption/decryption costs are quite affordable for today's mobile devices and smartphones, the studies on encryption/decryption for location privacy available in the current literature point out that a main drawback of this approach is the relatively high delay [37] introduced by the data encryption/decryption processes, a delay which may not be tolerable for many proximity-based services.

**Figure 1.** Three-fold classification of location-privacy mechanisms: starting from the edge device, a.k.a. user side (including two parts of the proposed privacy-preserving technique), communication part used for transferring data packets, and server-side perspective including the cases where the users' data is aggregated on the server.

Our proposed solution, described in the next section, is a combination of a privacy-preserving mapping (two mappings provided) and a noisy perturbation (two noise distributions studied).


**Table 1.** Overview of LPPM in the literature.

### **3. Proposed Perturbed Location Mechanism**

### *3.1. Scenario Definition, Hypotheses, and Preliminary Notations*

We adopt a scenario in which user devices are equipped with some form of indoor localization engine, e.g., a combination of cellular-based positioning, WiFi/BLE positioning, and positioning based on other smartphone sensors (barometers, gyroscopes, accelerometers, etc.), which is already the state of the art in indoor positioning. We also assume that each user $u$ has full control of his/her location data, modeled here via a 3D location vector $\mathbf{x}_u \in \mathbf{B}$. It is also assumed that the user can choose the perturbation level with which he/she discloses his/her own location data to a service provider. Thus, the user devices are able to apply a local perturbation mechanism $M(\mathbf{x}_u)$ before broadcasting the user location data to a service provider. Such a service provider can be, for example, a centralized digital contact-tracing server which computes, based on the available perturbed locations $M(\mathbf{x}_u)$, the relative distances between any two users in the building and compares them to a safety threshold $\gamma$ (e.g., $\gamma = 2$ m). The server stores such information in a database, together with timestamps and hashed user identities, and when a user $v$ informs the server that he or she has been diagnosed with COVID-19, the server is able to find the information about all other users $u$ that were in the vicinity of user $v$ in a certain time window. For simplicity, we drop the time index in our model and look at snapshot decisions. Thus, if $\|M(\mathbf{x}_u) - M(\mathbf{x}_v)\| \le \gamma$, user $u$ is informed by the contact-tracing server that he or she has been a 'close contact'. Above, $\|\cdot\|$ denotes the Euclidean norm, so that the norm of the difference between two vectors is the distance between them.

Another example of a service provider relying on such proximity detection is a provider of a 'find a friend' service. Again, users can install an application which transmits to the service provider the hashed identities of themselves and their friends, and the server keeps track of the distances $\|M(\mathbf{x}_u) - M(\mathbf{x}_v)\|$, based on the perturbed location information transmitted by each user. If $\|M(\mathbf{x}_u) - M(\mathbf{x}_v)\| \le \gamma$, then the users $u$ and $v$ are informed that their friend is nearby, within a distance $\gamma$. Again, the threshold parameter $\gamma$ can be user defined or server defined; most likely, for a 'find-a-friend' application, $\gamma$ can be higher (e.g., 5–10 m) than for a digital contact-tracing application.
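The server-side snapshot decision described above can be sketched as follows. This is only an illustration under stated assumptions: the dictionary `reported`, which maps hashed user identities to perturbed locations, the function name `close_contacts`, and the example identifiers are hypothetical, and timestamps are ignored, in line with the snapshot simplification.

```python
import math

def close_contacts(reported, user_id, gamma=2.0):
    """Return the ids of users whose reported location is within gamma of user_id.

    'reported' maps a (hashed) user id to the perturbed location M(x_u) that
    the user disclosed; the true locations are never seen by the server.
    """
    reference = reported[user_id]
    return sorted(
        other
        for other, location in reported.items()
        if other != user_id and math.dist(location, reference) <= gamma
    )

# Hypothetical perturbed 3D locations (metres) received by the server.
reported = {
    "a1f3": (0.0, 0.0, 0.0),
    "9bc2": (1.0, 1.0, 0.0),   # about 1.41 m from a1f3
    "77e0": (5.0, 0.0, 3.0),   # about 5.83 m from a1f3
}
print(close_contacts(reported, "a1f3", gamma=2.0))  # ['9bc2']
print(close_contacts(reported, "a1f3", gamma=6.0))  # ['77e0', '9bc2']
```

The second call mimics a 'find-a-friend' setting, where a larger threshold declares more users to be nearby.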

Let us denote the perturbed 3D location values via $\mathbf{y}_u$, with $\mathbf{y}_u = M(\mathbf{x}_u) \in \mathbf{B}$, where $\mathbf{B} \subset \mathbb{R}^3$ is the building space, defined as a box-shaped space with edges $[x_{min}, x_{max}] \times [y_{min}, y_{max}] \times [z_{min}, z_{max}]$, where $x_{min}$, $x_{max}$, $y_{min}$, $y_{max}$, $z_{min}$, $z_{max}$ are the building edges (minimum and maximum, respectively) in 3D space. It is assumed that the centralized digital contact-tracing server (which can be trusted or untrusted) has access to the building floor plans. It is also assumed that the server divides the whole building space into grid points $\mathbf{b} = [b_x, b_y, b_z] \in \mathbf{B}$, for example as shown in Figure 2, and that the set of grid points $\{\mathbf{b} \mid \mathbf{b} \in \mathbf{B}\}$ is transmitted to all users in the building, e.g., via cellular or WiFi connectivity. The grid step $\Delta s$ is a parameter of the centralized server providing proximity-detection or digital contact-tracing services. A step of $\Delta s$ means that $b_x$, for example, can only take values in the set $x_{min} : \Delta s : x_{max}$, i.e., $x_{min}, x_{min} + \Delta s, \ldots, x_{max}$.
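The grid construction can be sketched as below. Using the same step along all three axes (including the vertical one) is an assumption made here for brevity, and the function names `axis_values` and `building_grid` as well as the building dimensions are hypothetical.

```python
import math

def axis_values(lo, hi, step):
    """Values lo, lo + step, ..., not exceeding hi (the set lo : step : hi)."""
    # The small tolerance guards against floating-point round-off in the ratio.
    count = int(math.floor((hi - lo) / step + 1e-9))
    return [lo + k * step for k in range(count + 1)]

def building_grid(x_range, y_range, z_range, step):
    """All grid points b = (b_x, b_y, b_z) of the box-shaped building space."""
    return [
        (bx, by, bz)
        for bx in axis_values(x_range[0], x_range[1], step)
        for by in axis_values(y_range[0], y_range[1], step)
        for bz in axis_values(z_range[0], z_range[1], step)
    ]

# Hypothetical small building: 2 m x 1 m footprint, 1 m height, 1 m grid step.
grid = building_grid((0.0, 2.0), (0.0, 1.0), (0.0, 1.0), step=1.0)
print(len(grid))          # 3 * 2 * 2 = 12 grid points
print(grid[0], grid[-1])  # (0.0, 0.0, 0.0) (2.0, 1.0, 1.0)
```

A smaller step gives a denser grid, so the set of grid points that the server transmits to the users grows accordingly.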
