Article

Analyzing Mobility Patterns at Scale in Pandemic Scenarios Leveraging the Mobile Network Ecosystem

1 Department of Telematics Engineering, Universidad Carlos III de Madrid, 28911 Leganés, Spain
2 UC3M-Santander Big Data Institute (IBiDat), 28903 Getafe, Spain
3 International Computer Science Institute (ICSI), Berkeley, CA 94704, USA
* Author to whom correspondence should be addressed.
Electronics 2024, 13(18), 3654; https://doi.org/10.3390/electronics13183654
Submission received: 16 July 2024 / Revised: 20 August 2024 / Accepted: 12 September 2024 / Published: 13 September 2024

Abstract

The ubiquity and pervasiveness of mobile network technologies have made them so deeply ingrained in our everyday lives that, by interacting with them for very simple purposes (e.g., messaging or browsing the Internet), we produce an unprecedented amount of data that can be analyzed to understand our behavior. While this practice has been extensively adopted by telcos and big tech companies in the last few years, this condition, which was unimaginable just 20 years ago, has only been mildly exploited to fight the COVID-19 pandemic. In this paper, we discuss the possible alternatives that we could leverage in the current mobile network ecosystem to provide regulators and epidemiologists with the right understanding of our mobility patterns, to maximize the efficiency and extent of the introduced countermeasures. To validate our analysis, we dissect a fine-grained dataset of user positions in two major European countries severely hit by the pandemic. The potential of using these data, harvested employing traditional mobile network technologies, is unveiled through two exemplary cases that tackle macroscopic and microscopic aspects.

1. Introduction

While human society has periodically faced pandemics across its entire existence, we now have more complex and effective measures to understand, and possibly combat, how a pandemic, such as the one of COVID-19, develops and spreads. Specifically, in the first two decades of the 21st century, we have witnessed the development of the mobile networking ecosystem, through which billions of sophisticated devices, referred to as smartphones, are permanently connected to the Internet, leveraging various mobile technologies. These smartphones are equipped with hardware chipsets for GPS, WiFi, and cellular technologies. Additionally, they run sophisticated Operating Systems (OSes) and apps, which leverage some of the previous technologies to generate a geolocation footprint of smartphones and, by extension, the people owning them.
The reported penetration of smartphones, over 80% in developed and developing countries [1], along with their ability to generate accurate geolocation footprints, makes them a useful tool to help epidemiologists analyze the efficiency of different mobility restriction measures in fighting a pandemic. Other relevant socioeconomic aspects of mobility restrictions can also be explored, such as their impact on the economy and the level of adherence to the restrictions in different countries. Smartphones can generate geolocation footprints of the population by using various technologies (or combinations of them), such as the following:
  • Cellular Network Information: Mobile operators register the specific cell to which the smartphone is connected, providing a geographical footprint at the network cell level. Note that companies different from mobile operators can also generate maps of base station locations and register the cell id information on the device.
  • GPS Information: OS providers (for example, Google or Apple) as well as mobile app developers (with the permission of the user) can access the GPS location information. This makes it possible to generate a geographical footprint at the GPS level.
  • WiFi Signal Information: There are both public and proprietary maps of the location of existing WiFi SSIDs. OS providers and/or mobile app developers who have access to the WiFi SSIDs within range of the smartphone can use these maps to generate a geographical footprint of the device.
Building on these positioning methods and technologies, and on advanced techniques such as machine learning, several efforts have been made in the past to collect human mobility data and extract patterns of interest from them to promote the development of location-based services and applications [2]. These efforts, also known as crowdsensing [3], can also employ other kinds of network deployments, such as UAV-based ones [4].
During the COVID-19 pandemic, mobile operators [5], mobile OS providers [6,7], and companies owning popular mobile apps [8] have publicly released reports (and some data) about aggregated mobility patterns of users in different countries. While such information is doubtlessly useful, it lacks the level of detail needed to conduct an in-depth research study. Epidemiologists, economists, and sociologists need more granular data to conduct detailed research on mobility restriction measures and their socioeconomic impacts. Telcos (owning mobile networks) and big tech companies (owning OSes and/or mobile apps) justify their reluctance to share sufficiently detailed data with researchers (and even governments) in terms of data protection and privacy.
In this paper, we leverage an alternative source of data for assessing people’s mobility: Software Development Kits (SDKs) integrated in mobile applications. Contrary to telcos and big tech companies, some of these providers, such as Safegraph [9] or Predicio [10], have made their data available to researchers in the context of the COVID-19 pandemic. The contributions of this paper are twofold:
  • We discuss the alternative solutions for measuring human mobility using network data.
  • We leverage data from Predicio to showcase how this type of accessible data can be used to analyze citizens’ mobility patterns at different scales during the COVID-19 pandemic. We focus our analysis on two of the most affected countries: Spain and Italy. For this analysis, we leverage two state-of-the-art methodologies to (i) showcase how data coming from mobile terminals and mobile networks can be used in the context of a pandemic and (ii) discuss possible outcomes of this analysis.
We start in Section 2 with a discussion of the alternatives currently available for the collection of geolocation data, analyzing their characteristics, drawbacks, and opportunities in the context of a pandemic. Then, Section 3 presents an analysis of the evolution of the aggregated mobility of citizens in Spain and Italy, using the radius of gyration (ROG) [11] as a reference metric.
Section 4 presents an analysis of citizens’ mobility patterns at a finer scale relying on the concept of trajectory motifs [12]. To the best of our knowledge, this is the first analysis based on motifs of the citizens’ mobility evolution during the pandemic, which is a potent tool to understand micro-mobility patterns.
Section 5 considers some of the ethical questions raised by tracking users’ movements in this manner. Depending upon one’s views of the privacy issues at stake, our findings can provide hope for better public health, raise concerns about privacy, or both.
Our results show a dramatic reduction in citizens’ macro- and micro-mobility patterns in the most restrictive phases and how these patterns slowly return to the pre-COVID-19 situation as the mobility restrictions ease. This is in clear contrast with major studies performed in the past with similar data [13], which showed very high heterogeneity in the mobility patterns.

2. Measurement Alternatives and Recent Efforts

With the proliferation of portable and always-connected devices, monitoring the mobility patterns of very large populations offers several alternatives. As depicted in Figure 1, there are two major groups, one based on the user device and its capabilities (we will refer to these solutions as terminal-based solutions) and another one based on the network infrastructure (we will refer to them as infrastructure-based solutions). In this section, we briefly describe them, compare them in terms of their characteristics (granularity, accuracy, pervasiveness, completeness, and accessibility), and explain how they have been used in recent studies.

2.1. Infrastructure-Based and Passive Probes

Mobile networks keep track of the users’ approximate positions at any point in time to support basic functionality, such as user mobility or user roaming. This has been traditionally conducted with core network procedures, which kept track of the user to base station mapping through location updates or paging procedures. This rough location information does not require specific capabilities implemented in the terminal (i.e., the phone), and the network achieves a precision approximately equal to the size of the area covered by the antenna, which ranges from very small ones (micro-cells operating in cities) to larger ones (cells in rural areas). Network operators determine user locations using the serving antenna ID, which acts as a proxy for precise user location using the known positions of network antennas. By logging this information through network probes at the gateway level, operators can efficiently monitor and manage user mobility.
This information can be further refined using triangulation techniques available at specific network elements such as the MME [14]. To perform operations such as handovers, mobile terminals periodically report the received signal strength from other base stations. Therefore, by using triangulation techniques, the mobile network infrastructure can determine the position of the user with very high accuracy. However, the implementation of these techniques (e.g., the Location Management Function in 3GPP Release 16) imposes a very high overhead on the infrastructure, which has led a significant number of mobile operators not to deploy them, as reported in experience papers [15].

2.2. Terminal- and User-Based Solutions

User localization can also be performed on users’ handheld devices, which are equipped with all sorts of sensors (including positioning ones) that can provide the precise location of the terminal, with an error of a few meters. Additionally, the presence of wireless “virtual landmarks” such as base stations, Bluetooth beacons, or WiFi networks’ SSIDs can be used as georeferences.
All these sources enable the gathering of precise positioning information from the terminals. In the following list, we discuss these solutions according to the final purpose of the data analysis.
  • Added-value service through OS frameworks: Mobile OSes usually consider positioning information as one of the pieces of information available in the mobile terminal, and major mobile OSes offer application developers unified ways to gather such information.
    For instance, Android has offered the Location Framework API since its inception in version 1 (released in 2008), allowing developers to integrate location-based services into their applications. However, as technology and the demands for precision, efficiency, and feature diversity evolve, Android now advocates for the adoption of its advanced Awareness API [16]. This API enables better performance in terms of accuracy, battery utilization, and feature richness, and it requires the user’s exact location to be sent and processed in the cloud. A parallel evolution can be observed within Apple’s ecosystem, where the location framework, introduced alongside iOS version 2.0, has similarly undergone transformative advancements. While OS providers claim that all the processing is performed with the maximum privacy standards, and offer increased transparency to the users, they can still use these data to provide information such as the Google [7] and Apple [6] COVID-19 Mobility reports.
  • Digital ads’ bid requests: An increasingly important aspect of the mobile ecosystem is the one related to advertising. Ad providers track and characterize user profiles to offer the most well-suited advertisements and increase the click ratio. Advertisers create profiles out of each user’s individual behavior, and users’ location is an important feature of users’ profiles. Usually, advertisers rely on products such as GeoIP databases to gather location knowledge out of end-user activities.
    However, the need for an increasingly more precise advertising characterization is forcing ad providers to gather location information (usually available within apps or browsers) and enrich the profiling with this information.
    Thus, advertising stakeholders have a relatively good picture of user mobility, obtained by analyzing the location information.
  • Geolocation SDKs: Another technology currently used for tracking positioning information from users is Geolocation SDKs. Such third-party libraries provide app developers with an alternative way of monetizing their user engagement by including into their codebase the functionality needed to fetch the users’ current position, typically with GPS granularity, and upload them to the cloud.
    Once there, these third-party providers, a.k.a. location data providers, analyze this information to provide location intelligence analytics or directly sell anonymized data to businesses or researchers for other kinds of analysis. This kind of dataset is the one we use in Section 3 and Section 4 for our analysis.

2.3. Trade-Offs

As discussed in Section 1, estimating people’s mobility patterns is paramount to counter pandemics such as COVID-19. Having the possibility of estimating mobility patterns with the highest precision (in space and in time) can help analysts and decision-makers to understand the effect of already taken initiatives and simulate the extent of new ones. However, the techniques discussed above in this section present very diverse characteristics that make them complementary or particularly well suited for certain kinds of activities, as detailed next and summarized in Table 1.
  • Granularity: This is possibly the most critical factor, particularly from a spatial perspective. While infrastructure-based solutions are currently limited to tracking at the cell level, user-based methods can precisely pinpoint the location of each user at any point in time. Without relying on costly triangulation techniques, user-based approaches offer spatial resolutions ranging from tens to hundreds of meters, enabling highly accurate location tracking.
    Also, time granularity may play a role. Due to the high pervasiveness of cellular technology, logging user mobility at scale is a challenging task that can generate huge amounts of data. Thus, such reporting (especially if performed at the network core) can be aggregated in batches to increase scalability with a price in terms of time resolution. This aspect is particularly important for passive measurements, which may have many events to be logged. Still, it also has relative importance for the user-based solutions. However, the intrinsically human nature of the interaction (i.e., usually events are recorded upon a human-to-device input) makes this factor less important.
  • Accuracy: Infrastructure-based solutions are very accurate. Location errors are negligible due to the difficulty of spoofing the point of attachment to the network. User-based solutions, instead, offer a very heterogeneous accuracy level. Firstly, they depend on the quality of the GPS signal, which may be bad indoors. Then, such location reporting can be biased by using mock locations or even with GPS spoofing techniques. Still, such errors could be mitigated during data processing, or improved by using other terminal information such as the one coming from the WiFi/Bluetooth network scans.
  • Pervasiveness: Infrastructure-based solutions monitor the entire population in a developed country, as the penetration rate is beyond 100%. By joining the major telco operators in a country, a trustworthy picture of the mobility patterns can be achieved, although with the limitations in granularity discussed above.
    Similar considerations may apply to OS frameworks, as the major developers together cover practically the whole market, although joining the databases may be challenging. The other solutions (ads or SDKs) have a lower pervasiveness, as they depend on the number of installations of the mobile apps that embed the SDK, or on the number of visits to the pages that show advertisements. Finally, WiFi/Bluetooth-based solutions are limited to the areas they can cover.
  • Completeness: Infrastructure-based measurements obtain the full view of user mobility practically at any point in time if the mobile terminal is switched on, as control messages are sent frequently enough to provide a continuous view of the mobility trends (at the price of a large amount of generated data). User-based solutions, instead, heavily depend on the interactions between the human and the device. The trajectories generated by these techniques are more event-based (e.g., check-ins) rather than a continuous flow, although frameworks that are deeply embedded into the OS (e.g., maps services) may make this interaction more fluid.
  • Accessibility: While all the previously discussed factors are relevant, even the best dataset is useless if it is not available to the right people. The previous subsections show that infrastructure-based data offer the largest pervasiveness with a poorer granularity, whereas OS frameworks offer the best trade-off between pervasiveness and granularity. Unfortunately, none of these data have been made available to the research community, even to fight the COVID-19 pandemic. Big tech companies and telcos have shared only aggregate data [8], which are not enough to perform the required mobility analysis. Instead, location data providers, such as Predicio (the one we use in this paper), have released fine-grained data for research purposes in the context of COVID-19. In this paper, we showcase that, despite the limitations in the pervasiveness of these types of data compared to infrastructure-based data or OS frameworks, they are valid for conducting both macro- and micro-mobility analyses. Both infrastructure- and terminal-based solutions shall enforce the highest privacy standards. However, while the data generated by infrastructure-based solutions are a side product of the correct operation of, e.g., mobile networks, the data collected from terminals (hence end-users) shall enforce the highest transparency, as is the case for the data used in this work.

2.4. Current Efforts

Since the beginning of the COVID-19 pandemic, a plethora of works dealing with the analysis of its impact on society has been published. Among other metrics that have been considered in the literature, mobility is one of the most important. Human mobility has always been considered a reliable proxy for assessing the spread of viruses. This has also been corroborated for the COVID-19 pandemic: some authors [17,18] showed how the overall population mobility metrics have a deep correlation with the dynamics of the COVID-19 spread, while the authors of [19] showed that this correlation also has a role for reducing the infection probability. Additionally, the authors of [20] supported the importance of sharing at least aggregated mobility data to help combat COVID-19 based on the evidence.
Moreover, the work in [21] demonstrated that the pandemic had an impact not only on the way people move but also on the way we interact with the digital environment. One example of this change is the way people interacted on WhatsApp during COVID-19 compared to the pre-pandemic period, as reported by [22]. Another example is YouTube content creation, which was directly related to the mobility restrictions, as reported by [23]. In addition, other researchers analyzed changes in other aspects, such as commuter behaviors like bike sharing [24]. These findings underscore the multifaceted nature of the effects of the pandemic on various aspects of human behavior and societal systems.
However, as also discussed previously, large-scale mobility data gathered with sufficient quality are very scarce, and only very few works leverage them. For instance, [25] discusses how mobility changed in Japan according to the restrictions that were imposed by the government, finding patterns very similar to the ones observed in Europe. Our work is similar in spirit to this one; however, the two works differ significantly in the data and analytical methods used. The work in [25] only considers one country and the city of Tokyo, while ours spans two entire countries for a longer period of time.

3. Large-Scale Mobility Metrics

To show the capability of computing useful metrics that can help to counter the effect of users’ mobility, in the context of a pandemic, we analyze a dataset from Predicio, a geolocation SDK provider, obtaining a view on the mobility in two major European countries before, during, and after the first wave of the COVID-19 pandemic.

3.1. Data and Methodology

The data used in this paper have been provided to us by TAPTAP Digital [26] (with explicit consent from Predicio) in the context of the TAPTAP Digital-UC3M Chair [27] to develop aggregated human mobility analyses in the context of the COVID-19 pandemic.
The data we analyze span Spain and Italy for several months in 2020 (up to September for the Spanish case and May for the Italian). In total, we analyze more than 20B geolocated points (14B in Spain and 6B in Italy), referring to more than 500 k anonymous (average) daily users in Spain and 450 k in Italy.
Given the intrinsically on–off nature of the data, as discussed in Section 2, we filter the dataset to remove users that were just sporadically recorded. This has a resulting trade-off in the number of users that match the filter and the precision of the resulting metrics. After extensive trials, we select users with at least 5 points over a timespan of 8 hours in the day. This results in the following dataset cardinality: 437 k (average) daily users for the Spanish dataset, and 207 k (average) daily users for the Italian. We process the locations grouped by h3 [28], which assigns a unique identifier to hexagons of the Earth’s surface with different spatial resolutions. This facilitates data processing and minimizes the number of data points for the analysis performed in this paper.
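A minimal sketch of such an activity filter is given below. It assumes that records arrive as (user_id, timestamp) pairs and interprets the threshold as "at least 5 points whose first and last timestamps on the same day are at least 8 hours apart"; both the record layout and this interpretation are assumptions made for illustration, not a description of the actual processing pipeline.

```python
from collections import defaultdict
from datetime import datetime, timedelta

MIN_POINTS = 5                 # minimum daily points, as stated in the text
MIN_SPAN = timedelta(hours=8)  # minimum daily timespan, as stated in the text


def active_users_per_day(records):
    """Return {date: set of user ids} for users passing the filter.

    `records` is an iterable of (user_id, datetime) pairs. This layout is
    hypothetical; the real dataset schema is not described in the text.
    """
    stamps_by_user_day = defaultdict(list)
    for user_id, ts in records:
        stamps_by_user_day[(user_id, ts.date())].append(ts)

    kept = defaultdict(set)
    for (user_id, day), stamps in stamps_by_user_day.items():
        # Assumed interpretation: span = last timestamp - first timestamp.
        span = max(stamps) - min(stamps)
        if len(stamps) >= MIN_POINTS and span >= MIN_SPAN:
            kept[day].add(user_id)
    return dict(kept)
```

Users with many points concentrated in a short interval, or with only a couple of points far apart in time, are both discarded under this reading of the threshold.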
First, we analyze how restrictive measures in different countries affected human mobility. For this purpose, we analyze the full time series to understand people’s habits before and during the pandemic phase through a methodology that has been extensively used in the literature, even in works evaluating the effect of the pandemic in contexts outside of mobility [5]. We gain knowledge from the data by quantifying the extent of individual human trajectories, measuring their ROG [11]. With this metric, we can characterize the bulk of users’ movements by the average traveled distance over a specific time window t. Given the typical circadian rhythms of humans (day and night cycle, spanning the entire day), we set t to 24 h. This metric is formally computed as follows:
$$ r_g^a = \sqrt{\frac{1}{N_c^a(t)} \sum_{i=1}^{N_c^a} \left( r_i^a - r_{cm}^a \right)^2} $$
where $r_i^a$ is the $i$-th ($i = 1, \dots, N_c^a$) location recorded for user $a$ and $r_{cm}^a$ is its center of mass, namely
$$ r_{cm}^a = \frac{\sum_{i=1}^{N_c^a} r_i^a}{N_c^a} $$
Hence, for each day in the dataset, we computed the ROG of the users that passed the filter, obtaining an aggregated mobility metric. A very low value (e.g., equal to 0 or the precision of a GPS device) indicates that a user was static throughout the day, while a higher one (e.g., 200 m, 2 km, 10 km) suggests that the user moved along the neighborhood, the district, or the city.
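The daily ROG of a single user can be sketched as follows. The function name, the input layout (a list of latitude/longitude pairs in degrees), and the local equirectangular projection are assumptions made for this illustration; the projection is a reasonable approximation at city or country scale but would not be appropriate for trajectories crossing the antimeridian.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def radius_of_gyration_km(points):
    """Radius of gyration (km) of one user's locations in one time window.

    `points` is a list of (lat, lon) pairs in degrees (hypothetical layout).
    """
    if not points:
        raise ValueError("at least one location is required")
    n = len(points)
    # Center of mass: plain average of the coordinates (adequate locally).
    lat_cm = sum(lat for lat, _ in points) / n
    lon_cm = sum(lon for _, lon in points) / n
    cos_lat = math.cos(math.radians(lat_cm))

    squared_sum = 0.0
    for lat, lon in points:
        # Equirectangular approximation of the offset from the center of mass.
        dy = math.radians(lat - lat_cm) * EARTH_RADIUS_KM
        dx = math.radians(lon - lon_cm) * cos_lat * EARTH_RADIUS_KM
        squared_sum += dx * dx + dy * dy
    return math.sqrt(squared_sum / n)
```

A user who never moves yields a value of 0, while two locations about 2.2 km apart along a meridian yield roughly 1.1 km, that is, half of their separation.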
Using the ROG, we can effectively quantify the dispersion of an individual’s movement within a specific period. This provides a robust measure of mobility and allows us to analyze the changes due to the mobility disruption. The relative nature of the ROG also allows us to better quantify such changes, in contrast to absolute metrics such as trajectory lengths or the number of trajectory data points. In the following section, we discuss the obtained ROG statistics for the two analyzed countries, to understand how human mobility has been affected by the COVID-19 pandemic.

3.2. Mobility Time Dynamics

In this subsection, we analyze the median ROG over time during the first months of 2020 on a daily basis. Our data span from the first months of the year, when no mobility reduction measures were in place, up to the end of the summer, when, at least in Europe, the peak of the pandemic’s first wave had already passed and some relief measures had already been taken.
To better understand the temporal dynamics of the imposed regulations, we gathered data from the Oxford University Government Response Tracker [29]. Although there have been some discrepancies in the extent and the details of the regulation taken by each of the analyzed countries, we summarized them into four main phases: (i) pre-COVID-19, when restrictions were not present or affected just a minority of the population (e.g., the lockdown of a specific region), (ii) lockdown, when human mobility was forbidden, and just specific travel purposes were allowed, (iii) reopening, when some non-essential activities were re-allowed, and (iv) new normal, when most mobility restrictions were removed (data available only for Spain).
Figure 2 shows the median ROG over time. We can observe that Italy and Spain followed a very similar behavior, with the only difference being that the lockdown measure took place in Italy 7 days before Spain. The human mobility dropped to almost 0 in both countries, bringing down the median ROG from 1.2 km and 0.6 km to 0.01 km and 0.007 km in Italy and Spain, respectively.
After the lockdown phase, mobility started to grow again in both Italy and Spain in the reopening phase, reaching median ROG values of 0.18 km and 0.16 km at the end of the reopening phase, respectively. Mobility levels similar to those of the pre-COVID-19 period were not recovered in Spain until September 2020, three months after the new normal phase started. Specifically, the median ROG in September 2020 peaked at almost 0.5 km, while it was 0.6 km during the pre-pandemic phase. This indicates that habits changed quite drastically, and the mobility limitations impacted people well beyond the purely regulatory point of view.
As discussed in Section 2, one of the main drawbacks of user-based solutions is their completeness, especially for data coming from SDKs that require interactions with applications and large user bases, such as the one analyzed in this paper.
Hence, to validate our results, we have downloaded the dataset published by Apple that provides an aggregated daily mobility index [6]. In Figure 3, we show the linear regression between the daily time series of the median mobility index gathered from the Apple dataset in Spain and ours. We observe a Pearson correlation coefficient of 0.69 between the two datasets, which hints at a strong correlation between the results obtained with our ROG-based methodology and the Apple mobility index. This suggests a precision comparable, at least for the high-level outcomes, to that of an OS framework-based solution.
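For reference, the comparison between two daily time series can be sketched as below: a Pearson correlation coefficient and an ordinary least-squares line. The function names and the assumption that both series are already aligned day by day are illustrative choices, not details taken from the analysis itself.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def linear_fit(x, y):
    """Least-squares slope and intercept of y as a function of x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x)
    if var_x == 0:
        raise ValueError("cannot fit a line to a constant x series")
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / var_x
    return slope, mean_y - slope * mean_x
```

Here `x` would hold one index per day (e.g., the daily median ROG) and `y` the other index for the same days; days missing from either series need to be dropped beforehand.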

3.3. Mobility Extent Dynamics

Such strict regulation on user mobility aimed to reduce the probability of coming in contact with an infected person and avoid spreading the disease further in the case of being asymptomatic. We measure the effectiveness of these measures by analyzing the full set of the ROG values in the two countries, having a more in-depth view of how the regulations affected user mobility.
Thus, we show in Figure 4 the quantile–quantile (Q–Q) plot of the average ROG in the two selected countries during the pre-COVID-19 (x-axis) and lockdown (y-axis) phases. The bulk of the reduction is concentrated in two parts: short movements (as the first percentile drops from 11 km to 6 km in Spain and from 10 km to 4 km in Italy) and the central (25th to 75th percentiles) movements. For instance, moving from the 60th to the 75th percentile yields a substantial increase in the ROG during the pre-COVID-19 period in Italy (from 12 km to 13.5 km), which corresponds to a negligible increase during the lockdown phase (from 5.8 km to 6 km).
We also notice that there is a long tail of the ROG, which is distributed as a power law for the two countries (this is also consistent with the findings of [11]). Such a long tail (e.g., the 90th percentile of the COVID-19 ROG in Spain is around 12 km, while the 99th reaches almost 14 km) represents a non-negligible amount of people that had to travel longer, most likely for working purposes (note that, even during the lockdown phase, essential activities such as public transportation or goods carriers were not forbidden in all the analyzed areas).
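The points of a Q–Q plot such as the one discussed above can be obtained by pairing the percentiles of the two ROG samples. The sketch below uses the standard-library `statistics.quantiles` function with the inclusive method; the function name `qq_pairs` and the choice of the inclusive interpolation are assumptions for illustration.

```python
from statistics import quantiles


def qq_pairs(pre_covid_rog, lockdown_rog, n=100):
    """Pair the quantiles of two samples (x: pre-COVID-19, y: lockdown).

    With n=100 this returns 99 pairs, one for each percentile from the
    1st to the 99th.
    """
    q_pre = quantiles(pre_covid_rog, n=n, method="inclusive")
    q_lock = quantiles(lockdown_rog, n=n, method="inclusive")
    return list(zip(q_pre, q_lock))
```

Points lying below the diagonal indicate percentiles at which the ROG shrank during the lockdown, and the vertical distance from the diagonal gives the size of the reduction at that percentile.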
In general, the analysis shows a drastic reduction in the radius of gyration in both countries, which exhibit very similar behavior. Apart from the pre-COVID-19 phase, in which people in Italy traveled shorter daily distances, the effect on both countries was comparable.

4. Micro-Mobility

As discussed in Section 2, user-based measurements such as the ones analyzed in this paper may achieve a very high precision (basically comparable to the geopositioning system used) that allows for creating fine-grained statistics on user mobility.
In this section, we propose an example of the potential analysis that can be performed using SDK-based data. Specifically, we examine the possibilities of extracting micro-mobility patterns in the context of a pandemic.

4.1. Motifs

To show the capability of user-based location data, we adapted the methodology followed in [12] by computing motifs from the selected user trajectories. Motifs are an abstraction of the daily mobility pattern of users in a dataset and may be useful to understand fine-grained user behavior beyond coarser metrics such as the ROG analyzed in Section 3. Motifs are a graph representation of people’s trajectories, where locations in which a considerable amount of time is spent are considered as nodes of an undirected graph, and the movements between them are considered as edges. With an analysis of motifs, people’s travels can largely be characterized, as demonstrated by the authors of [30].
Therefore, for this analysis, we focus on a specific urban area available in our dataset, the region of Madrid, and compute the motifs for all the users that have their homes there. As already done for the ROG computation, we first filter the dataset to remove users who were only sporadically recorded. We set the same threshold as for the macro-mobility study, that is, at least 5 points per day over a timespan of 8 hours on the same day. With these filters, we have 500 k distinct total users for the Madrid subset, with an average of 20 k daily users. Then we compute the motifs as follows:
  • We compute the nodes of the motifs graphs by retaining only the locations where people spend a non-negligible amount of time, called stay-points. For this step, we apply the stay-point location algorithm available in the scikit-mobility library [31], which compresses a trajectory into a sequence of stay-points, computed using the DBSCAN algorithm, merging all observations belonging to a single trajectory that are closer than 200 m within a time window of 20 min. In this way, we isolate locations where contact may have actually taken place. Stay-points are calculated for each user on a daily basis.
  • To detect possible recurrent nodes (i.e., stay-points that are visited more than once in each trajectory), we discretize stay-points by using the Uber-h3 [28] spatial tessellation, which assigns a unique identifier to hexagons on the Earth’s surface with different spatial resolutions. We use the spatial resolution index 9, which corresponds to hexagons with an edge of approximately 120 m, an area comparable to the one used by the stay-point location algorithm. At this point, each motif is a graph, the nodes of which are the h3 identifiers of the stay-point locations, and the edges are the movements among them.
  • We finally compute the popularity of motifs by checking whether two users’ trajectories yield the same graph structure, i.e., by testing the isomorphism between them. For instance, two user trajectories that only have two nodes (i.e., two stay-points) with a back-and-forth behavior both count toward the popularity of the same motif.
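The last two steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes that each user-day has already been reduced to a time-ordered list of cell identifiers (for instance, the h3 indexes of the stay-points), it treats edges as directed, and it tests isomorphism by brute force over node relabelings, which is only practical because daily motifs have very few nodes. The function names (`motif_edges`, `canonical_form`, `motif_popularity`) are hypothetical.

```python
from collections import Counter
from itertools import permutations


def motif_edges(cells):
    """Turn one user's daily sequence of visited cells into nodes and directed edges.

    `cells` is a hypothetical, time-ordered list of cell identifiers
    (e.g., h3 indexes of stay-points) for a single user and day.
    """
    # Collapse consecutive repetitions: staying in the same cell is not a movement.
    visits = [c for i, c in enumerate(cells) if i == 0 or c != cells[i - 1]]
    nodes = sorted(set(visits))
    edges = {(a, b) for a, b in zip(visits, visits[1:])}
    return nodes, edges


def canonical_form(nodes, edges):
    """Return a label-independent key; two graphs share it iff they are isomorphic.

    Brute force over all node relabelings (assumption: motifs are small).
    """
    best = None
    for perm in permutations(range(len(nodes))):
        relabel = dict(zip(nodes, perm))
        key = tuple(sorted((relabel[a], relabel[b]) for a, b in edges))
        if best is None or key < best:
            best = key
    return (len(nodes), best)


def motif_popularity(daily_trajectories):
    """Count how many user-days fall into each motif (isomorphism class)."""
    return Counter(canonical_form(*motif_edges(t)) for t in daily_trajectories)
```

With this sketch, the toy trajectories `["home", "work", "home"]` and `["a", "b", "a"]` map to the same key, so both add to the popularity of the same two-node motif, regardless of where the stay-points are actually located.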
We next analyze the motifs’ popularity in our dataset, computing them on a daily basis.

4.2. Motifs Popularity

Following the methodology described in the previous subsection, we computed the popularity for each of the motifs we found in the analyzed data. The popularity is computed by averaging the cardinality of each motif during each of the four phases that characterized the regulatory framework in Spain [29] introduced in Section 3: pre-COVID-19, lockdown, reopening (i.e., when mobility was gradually re-allowed), and new normal (i.e., the plan enforced by the Spanish government for business-as-usual behavior under the continuing COVID-19 circumstances). In this analysis, we focused on Spain, as data were available for all the periods.
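As a rough illustration of this averaging step, the following Python sketch groups daily motif counts by phase and averages the daily share of each motif. Several elements are assumptions made only for the example: the phase start dates are illustrative placeholders and are not taken from the paper, the counts are normalized to daily percentages before averaging, and the names `PHASES`, `phase_of`, and `popularity_by_phase` are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Illustrative phase start dates (inclusive); placeholders, not the paper's values.
PHASES = [
    ("pre-COVID-19", date(2020, 1, 1)),
    ("lockdown", date(2020, 3, 14)),
    ("reopening", date(2020, 5, 4)),
    ("new normal", date(2020, 6, 21)),
]


def phase_of(day):
    """Return the name of the last phase whose start date is not after `day`."""
    current = None
    for name, start in PHASES:
        if day >= start:
            current = name
    return current


def popularity_by_phase(daily_counts):
    """Average the daily share (in percent) of each motif within each phase.

    `daily_counts` maps a date to a {motif_id: number of user-days} dict.
    """
    sums = defaultdict(lambda: defaultdict(float))
    days = defaultdict(int)
    for day, counts in daily_counts.items():
        total = sum(counts.values())
        phase = phase_of(day)
        if total == 0 or phase is None:
            continue  # skip empty days and days before the first phase
        days[phase] += 1
        for motif, n in counts.items():
            sums[phase][motif] += 100.0 * n / total
    return {p: {m: s / days[p] for m, s in motifs.items()} for p, motifs in sums.items()}
```

Averaging daily shares rather than raw counts keeps phases with different numbers of active users comparable; averaging the raw cardinalities instead would only require dropping the normalization by `total`.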
We remark that the split into these phases is not arbitrary: we observed drastic changes in the time series of the popularity of the different patterns across phases, which further corroborates our methodology. Results are depicted in Figure 5, where we show the popularity trends in the four time windows for the 12 most popular motifs, which account for 79.3% of the recorded motifs overall.
We observe that motifs with few stops during the day (e.g., A, B, E, F) experienced substantial growth with respect to the pre-pandemic phase (up to 13%). This is a logical consequence of the mobility reduction regulation, which fostered the adoption of teleworking and only allowed essential movements.
Consequently, the popularity of more convoluted motifs, such as G or H, dropped to almost negligible levels. Even in the new normal phase, the micro-mobility patterns had only partially returned to the pre-COVID-19 situation.
While motifs B and F (i.e., the ones that just recorded one stay-point, either without any movement or without other noticeable stops) lost most of the popularity gained during the lockdown phases, other ones such as A, D, and E kept almost all their popularity. Interestingly, these motifs involve two or three stops, indicating that even with no hard restrictions in place, people retained simpler daily routines (e.g., home–work, home–school, or home–supermarket) as a distinctive sign of the new situation.
These outcomes can help to understand how the mobility restriction countermeasures effectively impacted the population in a short time window (e.g., on a daily basis): more complex motifs involve social gathering locations (e.g., malls or restaurants) with a subsequent higher contact probability with unknown people. This results in a higher opportunity for virus spread and a lower opportunity for back-tracing the contacts of positive cases. Instead, simpler motifs involve mostly popular places such as home and work where encounters happen among known people. These have obvious benefits, such as the simpler and more effective back-tracing of contacts and easier enforcement of countermeasures such as wearing a mask or social distancing, e.g., in workplaces.

5. Ethical Considerations

In the context of this paper, we have used location data from SDKs installed in mobile phones. Phone users have more control over the apps they run than over cellular infrastructure and can opt out of SDK tracking by configuring privacy settings or not installing apps using such SDKs. On the other hand, users are often unaware of or misunderstand privacy settings and find them too intimidating to configure [32], and they may not be aware of all the SDKs they have installed [33]. While the use of SDK geolocation data is legal under current legislation when explicit consent is granted by the user, the difficulty of reading and comprehending privacy notices [34,35] may raise concerns about the use of such data for certain sensitive cases. In general, further research is needed into how smartphone users can exert meaningful control over their privacy without bogging down the user experience. At the same time, control over personal information sometimes yields to public health concerns.
The use of technology for the control of COVID-19, as well as future pandemics, has been debated due to its ethical implications and effects on privacy. Medical ethics and public health ethics overlap, but differ in important ways, with medical ethics being more focused on individual concerns and public health on collective ones [36]. Even during a pandemic, individual phone users have privacy concerns about public health apps [37]. However, WHO guidelines state that, under certain circumstances, “informed consent is not ethically required” for public health surveillance and that other approaches for ensuring ethical data collection and use, such as transparency, good governance, training, and data security, should be used [38]. Anom analyzes the ethics of involuntarily collecting mobile phone data for COVID-19 tracking [39]. He concludes that it is ethical, pointing out that most people considered far more invasive control methods, such as lockdowns, to be ethical [39], but admits that the case is stronger under consequentialist theories of ethics than under deontological ones [39]. All these works consider the nuances and balance competing concerns, such as the effectiveness of the surveillance, the virulence of the disease, and the minimization of data collection and use. We conclude that whether deploying the surveillance methods tested herein is ethical depends upon the details of the deployment.
The use of granular mobility data raises significant privacy and ethical concerns. However, in this paper, we have followed best practices to treat the referred sensitive data. The methodology and analyses described in this paper were approved by the Data Protection Officer (member of the Ethical Committee) of Universidad Carlos III de Madrid, confirming their compliance with the GDPR [40]. Moreover, the data provider enforced transparency in the collection of user location data [10], which is considered personally identifiable information (PII) by the GDPR. Finally, we only analyzed aggregated location data for the purpose of this paper.

6. Conclusions

This paper explores the potential of leveraging the mobile network ecosystem—from user devices to cloud servers, including physical infrastructure like base stations—as a valuable resource for understanding large-scale population mobility patterns. Such insights have proven crucial during crises like the COVID-19 pandemic. We argue that major network operators can easily collect and share these data with researchers and health authorities to optimize pandemic response strategies.
To demonstrate this potential, we employed two established metrics: ROG and location motifs. Using data from Spain and Italy, we analyzed population mobility trends at both the macro and micro levels. While our study only provides a first iteration on the problem, it highlights the ability of mobile network data to reveal various mobility patterns. These data could also be used to identify crowded areas, track disease spread, and assess the effectiveness of mobility restrictions.
Our findings suggest that mobile network data are a powerful tool for monitoring and understanding the impact of pandemic-related mobility changes. By analyzing location motifs, we demonstrated the depth of insights that can be derived from these data.

Author Contributions

Conceptualization, R.C. and Á.C.; methodology, P.C. and M.G.; software, P.C.; validation, P.C., M.G., R.C., Á.C. and M.C.T.; writing—original draft preparation, P.C.; writing—review and editing, M.G., R.C., Á.C. and M.C.T.; supervision, R.C.; funding acquisition, R.C. and Á.C. All authors have read and agreed to the published version of the manuscript.

Funding

This work has been supported by the Spanish Ministry of Economic Affairs and Digital Transformation and the European Union-NextGenerationEU through the UNICO 5G I+D project 6G-RIEMANN; TAPTAP Digital-UC3M Chair in advanced AI and data science applied to advertising and marketing; the framework of the Recovery, Transformation and Resilience Plan funds, financed by the European Union (Next Generation) through the grant Análisis y miTIgación de riesgos de seguridad y privaCIdad a asociados a la explotación de datos PersonAles en OTTs (ANTICIPA); and the Ministerio de Ciencia e Innovación/AEI and the European Union-NextGenerationEU through the project Towards an Auditable Internet (AUDINT) under the grant number TED2021-132076B-I00. We also gratefully acknowledge funding support from the National Science Foundation (Grant 2055772). The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the NSF, the U.S. Government, or our employers.

Data Availability Statement

Restrictions apply to the availability of these data. Data were obtained from Predicio and are available from the authors with the permission of Predicio.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
COVID-19: Coronavirus Disease 2019
MME: Mobility Management Entity
OS: Operating System
SSID: Service Set Identifier
ROG: Radius of gyration
SDK: Software Development Kit
GPS: Global Positioning System
GDPR: General Data Protection Regulation

References

  1. Deloitte. Global Mobile Consumer Trends, 2nd Edition. Mobile Continues Its Global Reach into All Aspects of Consumers’ Lives. Available online: https://www2.deloitte.com/content/dam/Deloitte/us/Documents/technology-media-telecommunications/us-global-mobile-consumer-survey-second-edition.pdf (accessed on 15 July 2024).
  2. Toch, E.; Lerner, B.; Ben-Zion, E.; Ben-Gal, I. Analyzing large-scale human mobility data: A survey of machine learning methods and applications. Knowl. Inf. Syst. 2019, 58, 501–523.
  3. Yang, Y.; Zhang, B.; Guo, D.; Wang, W.; Nie, J.; Xiong, Z.; Xu, R.; Zhou, X. Stochastic Geometry-Based Age of Information Performance Analysis for Privacy Preservation-Oriented Mobile Crowdsensing. IEEE Trans. Veh. Technol. 2023, 72, 9527–9541.
  4. Han, Z.; Yang, Y.; Bilal, M.; Wang, W.; Krichen, M.; Alsadhan, A.A.; Ge, C. Smart Optimization Solution for Channel Access Attack Defense Under UAV-Aided Heterogeneous Network. IEEE Internet Things J. 2023, 10, 18890–18897.
  5. Lutu, A.; Perino, D.; Bagnulo, M.; Frias-Martinez, E.; Khangosstar, J. A Characterization of the COVID-19 Pandemic Impact on a Mobile Network Operator Traffic. In Proceedings of the ACM Internet Measurement Conference, Virtual Event, 27–29 October 2020; pp. 19–33.
  6. Apple. Mobility Trends Reports. No Longer Reported. Available online: https://covid19.apple.com/mobility (accessed on 15 July 2024).
  7. Google. COVID-19 Community Mobility Reports. Available online: https://www.google.com/covid19/mobility/ (accessed on 15 July 2024).
  8. Meta. Data for Good. COVID-19. Available online: https://dataforgood.facebook.com/dfg/ (accessed on 15 July 2024).
  9. Safegraph. The Source of Truth for Places Data. Available online: https://www.safegraph.com/ (accessed on 15 July 2024).
  10. Predicio. Location-Based Behavior Intelligence. Available online: https://proptechzone.com/startups/predicio/ (accessed on 15 July 2024).
  11. González, M.C.; Hidalgo, C.A.; Barabási, A.L. Understanding individual human mobility patterns. Nature 2008, 453, 779–782.
  12. Schneider, C.M.; Belik, V.; Couronné, T.; Smoreda, Z.; González, M.C. Unravelling daily human mobility motifs. J. R. Soc. Interface 2013, 10, 20130246.
  13. Pappalardo, L.; Simini, F.; Rinzivillo, S.; Pedreschi, D.; Giannotti, F.; Barabási, A.L. Returners and explorers dichotomy in human mobility. Nat. Commun. 2015, 6, 8166.
  14. Alsaeedy, A.A.; Chong, E.K. A review of mobility management entity in LTE networks: Power consumption and signaling overhead. Int. J. Netw. Manag. 2020, 30, e2088.
  15. Calabrese, F.; Colonna, M.; Lovisolo, P.; Parata, D.; Ratti, C. Real-Time Urban Monitoring Using Cell Phones: A Case Study in Rome. IEEE Trans. Intell. Transp. Syst. 2011, 12, 141–151.
  16. Google. Awareness API. Available online: https://developers.google.com/awareness (accessed on 15 July 2024).
  17. Kraemer, M.U.; Yang, C.H.; Gutierrez, B.; Wu, C.H.; Klein, B.; Pigott, D.M.; Du Plessis, L.; Faria, N.R.; Li, R.; Hanage, W.P.; et al. The effect of human mobility and control measures on the COVID-19 epidemic in China. Science 2020, 368, 493–497.
  18. Rahman, M.M.; Thill, J.C. Associations between COVID-19 pandemic, lockdown measures and human mobility: Longitudinal evidence from 86 countries. Int. J. Environ. Res. Public Health 2022, 19, 7317.
  19. Zhou, Y.; Xu, R.; Hu, D.; Yue, Y.; Li, Q.; Xia, J. Effects of human mobility restrictions on the spread of COVID-19 in Shenzhen, China: A modelling study using mobile phone data. Lancet Digit. Health 2020, 2, e417–e424.
  20. Buckee, C.O.; Balsari, S.; Chan, J.; Crosas, M.; Dominici, F.; Gasser, U.; Grad, Y.H.; Grenfell, B.; Halloran, M.E.; Kraemer, M.U.; et al. Aggregated mobility data could help fight COVID-19. Science 2020, 368, 145–146.
  21. Li, T.; Zhang, M.; Li, Y.; Lagerspetz, E.; Tarkoma, S.; Hui, P. The Impact of COVID-19 on Smartphone Usage. IEEE Internet Things J. 2021, 8, 16723–16733.
  22. Seufert, A.; Poignée, F.; Hoßfeld, T.; Seufert, M. Pandemic in the digital age: Analyzing WhatsApp communication behavior before, during, and after the COVID-19 lockdown. Humanit. Soc. Sci. Commun. 2022, 9, 140.
  23. Mejova, Y.; Kourtellis, N. Youtubing at home: Media sharing behavior change as proxy for mobility around covid-19 lockdowns. In Proceedings of the 13th ACM Web Science Conference 2021, Virtual Event, 21–25 June 2021; pp. 272–281.
  24. Pase, F.; Chiariotti, F.; Zanella, A.; Zorzi, M. Bike sharing and urban mobility in a post-pandemic world. IEEE Access 2020, 8, 187291–187306.
  25. Yabe, T.; Tsubouchi, K.; Fujiwara, N.; Wada, T.; Sekimoto, Y.; Ukkusuri, S.V. Non-compulsory measures sufficiently reduced human mobility in Tokyo during the COVID-19 epidemic. Sci. Rep. 2020, 10, 18053.
  26. TAPTAP Digital. Intelligence for Marketing. Available online: https://www.taptapdigital.com/ (accessed on 15 July 2024).
  27. Universidad Carlos III de Madrid. Research Portal. Chair TAPTAP DIGITAL-UC3M in Advanced AI and Data Science Applied to Advertising and Marketing. Available online: https://researchportal.uc3m.es/display/act538274 (accessed on 15 July 2024).
  28. Uber Engineering. H3: Uber’s Hexagonal Hierarchical Spatial Index. Available online: https://eng.uber.com/h3/ (accessed on 15 July 2024).
  29. Hale, T.; Webster, S.; Petherick, A.; Phillips, T.; Kira, B. Oxford COVID-19 Government Response Tracker (OxCGRT). Last Update. Available online: https://www.bsg.ox.ac.uk/research/covid-19-government-response-tracker (accessed on 15 July 2024).
  30. Cao, J.; Li, Q.; Tu, W.; Wang, F. Characterizing preferred motif choices and distance impacts. PLoS ONE 2019, 14, e0215242.
  31. Pappalardo, L.; Simini, F.; Barlacchi, G.; Pellungrini, R. Scikit-Mobility: A Python Library for the Analysis, Generation and Risk Assessment of Mobility Data. arXiv 2019, arXiv:1907.07062.
  32. Frik, A.; Kim, J.; Sanchez, J.R.; Ma, J. Users’ Expectations About and Use of Smartphone Privacy and Security Settings. In Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems, CHI ’22, New Orleans, LA, USA, 29 April–5 May 2022.
  33. Cox, J. Leaked Location Data Shows Another Muslim Prayer App Tracking Users. Vice. 2021. Available online: https://www.vice.com/en/article/muslim-app-location-data-salaat-first/ (accessed on 15 July 2024).
  34. Jensen, C.; Potts, C. Privacy Policies as Decision-Making Tools: An Evaluation of Online Privacy Notices. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI ’04, Vienna, Austria, 24–29 April 2004; pp. 471–478.
  35. McDonald, A.M.; Cranor, L.F. The Cost of Reading Privacy Policies. I/S A J. Law Policy Inf. Soc. 2008, 4, 543.
  36. Swain, G.R.; Burns, K.A.; Etkind, P. Preparedness: Medical Ethics versus Public Health Ethics. J. Public Health Manag. Pract. 2008, 14, 354–357.
  37. Wang, T.; Guo, L.; Bashir, M. COVID-19 Apps and Privacy Protections from Users’ Perspective. Proc. Assoc. Inf. Sci. Technol. 2021, 58, 357–365.
  38. World Health Organization. WHO Guidelines on Ethical Issues in Public Health Surveillance; World Health Organization: Geneva, Switzerland, 2017.
  39. Anom, B.Y. The Ethical Dilemma of Mobile Phone Data Monitoring during COVID-19: The Case for South Korea and the United States. J. Public Health Res. 2022, 11, 22799036221102491.
  40. GDPR. Complete Guide to GDPR Compliance. Available online: https://gdpr.eu/ (accessed on 15 July 2024).
Figure 1. Pictorial representation of the measurement alternatives in the mobile network ecosystem.
Figure 2. The median (light) and 7-days moving average (dark) radius of gyration for Spain (top) and Italy (bottom).
Figure 3. Scatter plot for the median radius of gyration time series and the median Apple mobility index.
Figure 4. The qq plot of the average radius of gyration in Spain (left) and Italy (right) before and during the lockdown phases.
Figure 5. The recorded popularity for the top 12 motifs in Spain.
Table 1. The qualitative comparison across the different discussed technologies.
Technology | Granularity | Accuracy | Pervasiveness | Completeness | Accessibility
Infrastructure based | Medium, difficult to achieve | High | High | High | Very Low
Terminal Based: OS frameworks | High | Medium | High | High | Very Low
Terminal Based: Digital Ads | High | Medium | Low | Low | High
Terminal Based: SDK | High | Medium | Low | Low | High
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Callejo, P.; Gramaglia, M.; Cuevas, R.; Cuevas, Á.; Tschantz, M.C. Analyzing Mobility Patterns at Scale in Pandemic Scenarios Leveraging the Mobile Network Ecosystem. Electronics 2024, 13, 3654. https://doi.org/10.3390/electronics13183654
