1. Introduction
Evacuation plans are very important tools to guide people in different emergency situations on how to evacuate. However, when a disaster occurs, even though there are predefined shelters such as schools, communal halls, libraries, and other buildings, in many cases, people have an instinctive feeling as to the direction of safety [
1]. They tend to move away from danger and towards destinations perceived as safe [
2]. In addition to that, the evacuation behaviors also depend on whether all shelters in the surrounding area can provide refuge for people at a reasonable evacuation distance and time immediately after a disaster [
3,
4]. The human behavior in different emergency contexts is explained by social and psychological theories and associated underlying cognitive and behavioral processes [
5,
6,
7]. Moreover, in some cases, depending on the intensity of a disaster, shelters may suffer damage themselves or fail to provide the safety and security needed, forcing people to move to places other than shelters [
8]. Therefore, when a disaster has struck, people in shelters do not represent 100% of the displaced population, which brings a challenge to government and humanitarian aid organizations when it comes to allocating resources for disaster response.
Knowing the spatial distribution of the displaced population (including people that moved to shelters and other places) is crucial for rapid and effective disaster response, which among other things aims to meet humanitarian needs such as food, clothing, public health, and safety [
9,
10]. Traditional methods for estimating population displacement are based on population surveys. For example, Hori et al. [
11] used the 2006 Louisiana Health and Population Survey (LHPS) to describe three distinct dimensions of displacement dynamics: in-migration, out-migration, and intra-parish movement in southern Louisiana after hurricanes Katrina and Rita, on 29 August 2005 and 24 September 2005, respectively. Gray et al. [
12] used large-scale survey data collected from respondents living in coastal areas of Indonesia before and after the 2004 Indian Ocean tsunami. Survey data were combined with satellite imagery and multivariate statistical analyses to map vulnerability to post-tsunami displacement across the provinces of Aceh and North Sumatra. In addition to that, the survey data were used to compare patterns of migration after the tsunami between damaged areas and areas not directly affected by the tsunami. Recently, governments and other humanitarian aid organizations have been using information about the spatial distribution of people in shelters provided by the Displacement Tracking Matrix (DTM) after surveying the affected populations [
13].
However, conducting a survey is time-consuming, infeasible at scale, and provides low coverage, i.e., it can only count the population that is in the shelters. In addition, the information cannot be delivered in a timely fashion. For example, cyclone Idai struck Beira city on 14 March 2019, but the survey by DTM in close coordination with Mozambique’s National Institute for Disaster Management (INGC) was conducted between 2 and 13 May 2019, i.e., a month and a half after the disaster [
14]. The result showed that by the time the survey was conducted, some people had started to return to their areas of origin, which means that, using this information, the government and other humanitarian agencies could not assist people when they needed it. Therefore, a timely estimate of the spatial distribution of the displaced population is necessary.
Many approaches have been proposed to model the people’s behavior during the evacuation and to estimate the spatial distribution of displaced people after a disaster to support response activities. For instance, Osaragi [
15] constructed different models to understand the behavior of individuals attempting to reach home on foot in the wake of the devastating 2011 Tohoku-Pacific Ocean Earthquake. Hu et al. [
16] used the susceptible–infective-removal (SIR) model to understand the spread of disaster risk perceived among homeless victims and other disaster-affected people while considering the effects of psychological interventions on them. Yabe et al. [
17] proposed a novel framework to estimate evacuation hotspots after the Kumamoto earthquake using Global Positioning System (GPS) traces of smartphones collected by Yahoo Japan. The experiment showed promising results. However, the sample data (approximately 1% of the population from all over Japan) were very limited. To overcome sample bias, anonymized mobile Call Detail Records (CDRs) data have been proposed as an alternative source of information to estimate the spatial distribution of the population used as a base to estimate the displacement matrix after a disaster. For example, Kubíček et al. [
18] used mobile phone data to model population distribution at a fine spatio-temporal scale in the city of Brno, Czech Republic, and discussed the potential use of the proposed approach within selected emergencies. The analysis was based on the number of people visiting and transiting in each specific area. The results of the analysis were compared to census data and proved how the proposed method can improve the spatial granularity of the number of visiting and transiting in each specific area. Li et al. [
19] used large-scale mobile phone data to estimate fine-grained dynamic population distribution and high-resolution population map in Shanghai city, China, and analyzed spatio-temporal interaction of human movement. Recently, Wang et al. [
20] presented a systematic literature review paper showing the different applications of mobile phone data for emergency management. The review was based on the analysis of 65 related articles written between 2014 and 2019 from six electronic databases. Zhang et al. [
21] proposed a method for the estimation of finely-grained spatio-temporal human population density distributions using mobile CDRs. Moreover, Balistrocchi et al. [
22] discussed the suitability of mobile phone data to derive crowding maps. Through this study, characteristic exposure spatio-temporal patterns and their uncertainties were detected with regard to land cover and calendar period. However, the proposed approach still deserves verification during real-world flood episodes.
Bengtsson et al. [
23] used position data of Subscriber Identity Module (SIM) cards from the largest mobile phone company in Haiti (Digicel) to estimate the magnitude and trends of population movements following the Haiti 2010 earthquake and cholera outbreak. The experimental study was conducted in Port-au-Prince and proved that the geographic distribution of population movements from Port-au-Prince corresponded with results from a large retrospective, population-based UN survey. Lu et al. [
24] used the same data (from Digicel) to evaluate the predictability of population displacement after the 2010 Haiti earthquake. The authors found that both the travel distances and size of people’s movement trajectories grew after the earthquake. In addition to that, the results suggested that the predictability of people’s trajectories remained high and even increased slightly during the three months after the earthquake. Another study was conducted by Wilson et al. [
25]. In this study, the authors proposed a method for rapid and near real-time assessments of population displacement following the 2015 Nepal earthquake by analyzing the movement of 12 million anonymized mobile phone users. The result showed the evolution of population mobility patterns after the earthquake and the patterns of return to affected areas at a high level of detail.
However, the methods presented by Bengtsson et al. [
23] and Wilson et al. [
25] rely on defining the daily user’s home location as the last place he/she made a call from, which in many cases is not realistic since a person can have their last mobile activity before getting home. Furthermore, even while at home, a mobile activity can be placed in a cell-tower that is far away from the user’s home location due to overload in the nearest cell-tower. Moreover, each user is assigned a daily location at the district level based on the location that the corresponding user’s home cell-tower is in, without considering that the cell-tower can be shared among different districts. The aim of this research is to estimate the spatial distribution of displaced people after a severe cyclone using anonymized mobile phone data.
Therefore, this research adopts several proven techniques from different studies, such as study area tessellation [
26], home location estimation [
25,
27], areal interpolation [
28], and origin-destination estimation [
27,
29] to derive a method for the spatial distribution of displaced population after a disaster. This method contributes in three aspects, namely: (1) user’s home location (at cell-tower level) is based on the frequency of usage of mobile phone during the night time; (2) each user is assigned a home location (at the neighborhood level) considering the coverage area of the cell-tower that he/she was found to be living in; (3) to evaluate the accuracy of the proposed method, validation was introduced, which compares the displaced population estimated using CDRs with remote-sensing-based building assessment data and Displacement Tracking Matrix (DTM) from an individual questionnaire survey conducted after a severe cyclone in Beira city, central Mozambique, in March 2019. The proposed method can be used to derive a near-real-time displaced population matrix after a disaster, which in turn can support the response teams in their activities that, among others, aim to provide for the basic humanitarian needs of affected people.
The remainder of the paper is organized as follows. Section 2 presents the study area and the methods used in this research. Section 3 describes the data used in this research. Section 4 presents experimental results of the proposed method and the validation of the estimated displacement matrix. Finally, Section 5 concludes and presents future directions of the research.
4. Experimental Results, Validation, Discussion
This section presents the main results of the experimental study, describes the validation process carried out to evaluate the effectiveness of the proposed method, and discusses the results.
4.1. Experimental Results
After estimating users’ home locations (at cell-tower level) before and after the disaster, these data were aggregated to compute the population of mobile phone users in each cell-tower before and after the disaster.
Figure 8 presents the aggregated population obtained by counting the distinct users in each cell-tower before and after the disaster. Figure 8 shows a clear difference between the two aggregated maps of mobile phone users (before and after the disaster). The first visible difference is that Figure 8b contains some blank areas (missing Voronoi polygons), which represent the mobile phone cell-towers that were still out of order by the end of the study period (8 April 2019). Other visible differences can be seen in the central area of the city (the area with dense cell-towers), where some cell-towers registered an increase in the mobile phone user population after the disaster and others registered a decrease. This was because some of these areas were less affected, and therefore people moved in, while other areas were highly affected, and people moved out.
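The home location estimate and the per-tower aggregation described above can be illustrated with a minimal Python sketch. It assumes that a user's home cell-tower is the tower with the most night-time activity; the record layout, the sample records, and the night window (20:00 to 05:59) are hypothetical and are not taken from the study.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical CDR records: (user_id, tower_id, ISO timestamp).
records = [
    ("u1", "T1", "2019-03-06T22:15:00"),
    ("u1", "T1", "2019-03-07T23:40:00"),
    ("u1", "T2", "2019-03-07T12:05:00"),  # daytime, ignored
    ("u2", "T2", "2019-03-06T21:30:00"),
    ("u2", "T2", "2019-03-08T02:10:00"),
    ("u2", "T1", "2019-03-08T05:45:00"),
    ("u3", "T2", "2019-03-09T14:00:00"),  # daytime only, no home found
]

# Assumed night window (not from the paper): 20:00 to 05:59.
NIGHT_START, NIGHT_END = 20, 6


def is_night(timestamp):
    """Return True if the record falls inside the assumed night window."""
    hour = datetime.fromisoformat(timestamp).hour
    return hour >= NIGHT_START or hour < NIGHT_END


def estimate_home_towers(cdr):
    """Return {user: tower with the most night-time activity}."""
    night_counts = defaultdict(Counter)
    for user, tower, timestamp in cdr:
        if is_night(timestamp):
            night_counts[user][tower] += 1
    # Ties are broken by tower id so the result is deterministic.
    return {
        user: min(counts, key=lambda tower: (-counts[tower], tower))
        for user, counts in night_counts.items()
    }


def users_per_tower(home_towers):
    """Count the distinct users whose home is each tower."""
    return Counter(home_towers.values())


home = estimate_home_towers(records)
population = users_per_tower(home)
```

Running the same two functions on the records from before and after the disaster would give the two per-tower counts that are compared in Figure 8. Users with no night-time activity (such as `u3` here) receive no home location in this sketch.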
Based on the area of each Voronoi polygon that falls within each neighborhood, the number of users to be assigned to each administrative neighborhood was calculated. The corresponding users were then randomly selected from the mobile phone users found to be living in the Voronoi polygon before and after the disaster and assigned to that neighborhood. The aggregated assigned mobile phone users in each neighborhood before and after the disaster are presented in Figure 9.
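The area-weighted assignment can be sketched as follows. This is an illustration only: the function name, the neighborhood labels `A` and `B`, the area shares, and the largest-remainder rounding are assumptions introduced here, not details confirmed by the text.

```python
import random


def allocate_users(users, area_shares, seed=0):
    """Randomly split one polygon's users among neighborhoods by area share.

    users: ids of users whose home is this Voronoi polygon.
    area_shares: {neighborhood: fraction of the polygon area inside it},
                 assumed to sum to 1.
    """
    rng = random.Random(seed)
    shuffled = list(users)
    rng.shuffle(shuffled)  # random selection of who goes where

    # Largest-remainder rounding (an assumption) so quotas sum to len(users).
    raw = {name: len(shuffled) * share for name, share in area_shares.items()}
    quotas = {name: int(value) for name, value in raw.items()}
    leftover = len(shuffled) - sum(quotas.values())
    largest_first = sorted(raw, key=lambda name: (quotas[name] - raw[name], name))
    for name in largest_first[:leftover]:
        quotas[name] += 1

    # Hand out consecutive slices of the shuffled list.
    assignment, start = {}, 0
    for name in sorted(quotas):
        for user in shuffled[start:start + quotas[name]]:
            assignment[user] = name
        start += quotas[name]
    return assignment


# Hypothetical polygon with 8 users, 25% of its area in A and 75% in B.
tower_users = [f"u{i}" for i in range(8)]
shares = {"A": 0.25, "B": 0.75}
assignment = allocate_users(tower_users, shares)
```

Because the assignment depends only on area shares, a neighborhood whose main cell-tower is out of service can still receive users from neighboring polygons that partially cover it.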
From Figure 9, some interesting differences can be seen. For example, in the Nhangau neighborhood (1), there is a drastic change in the number of mobile phone users who were assigned to this area. This is because the main cell-tower that covers this area was out of service for the entire period after the disaster considered in this study, as shown in Figure 8. However, since some other cell-towers partially cover this neighborhood, it was possible to assign some mobile phone users to this neighborhood after the disaster. If the assignment of users were based on the coordinates of the cell-tower without considering the coverage area of each cell-tower, as in Wilson et al. [25], the Nhangau neighborhood would not have any users after the disaster. This example clearly shows how important it is to consider the coverage area of the cell-tower in the analysis.
The users assigned to the administrative neighborhoods were then aggregated based on their home location neighborhood ID before and after the disaster, and the mobile phone users’ displacement matrix was derived. Figure 10 shows the mobile phone users’ displacement matrix.
In Figure 10, the elements on the northwest (NW)–southeast (SE) diagonal of the matrix represent the mobile phone users who were not displaced after the disaster, i.e., the users who remained in their home neighborhood. The other elements in the matrix represent the mobile phone users’ flow from origin to destination. Some neighborhoods registered a high number of mobile phone users who remained in their origin areas, namely Chaimite, Chipangara, Chota, Mananga, Munhava Central, Ponta Gea, and Tchonja. Among these neighborhoods, the highest number of non-displaced mobile phone users was registered in Ponta Gea, which is explained by the fact that only 29% of the infrastructure in this area was damaged (the lowest damage rate in the whole of Beira city) (REACH [42]). Moreover, few mobile phone users moved to the Ndunda and Vaz neighborhoods. This is because these two neighborhoods were among the most damaged ones (both with over 80% of buildings damaged) (REACH [42]).
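The construction of the users' displacement matrix from the before and after home neighborhoods can be sketched as below. The neighborhood names come from the text, but the user ids and counts are hypothetical, and the choice to skip users without a home location after the disaster is an assumption of this sketch.

```python
from collections import Counter


def displacement_matrix(home_before, home_after):
    """Count users for every (origin, destination) neighborhood pair.

    Users without a home location after the disaster are skipped
    (an assumption of this sketch).
    """
    flows = Counter()
    for user, origin in home_before.items():
        destination = home_after.get(user)
        if destination is not None:
            flows[(origin, destination)] += 1
    return flows


# Hypothetical home neighborhoods before and after the disaster.
before = {"u1": "Vaz", "u2": "Vaz", "u3": "Ponta Gea", "u4": "Chota"}
after = {"u1": "Ponta Gea", "u2": "Vaz", "u3": "Ponta Gea"}

matrix = displacement_matrix(before, after)
# Diagonal entries (origin == destination) are the non-displaced users.
not_displaced = sum(n for (o, d), n in matrix.items() if o == d)
```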
The mobile phone users displacement matrix was then scaled up using the population data to obtain the actual flow among the origin-destination pairs.
Figure 11 presents the scaled-up displacement matrix.
In Figure 11, three neighborhoods that were present in the mobile phone users’ displacement matrix (Figure 10) are absent, namely Chamba, Nhaconjo, and Tchonja. The reason is that population data for these neighborhoods are unavailable in the map from which these data were extracted [31].
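One plausible way to scale the users' matrix up to population flows is sketched below. The scaling rule used here (census population of the origin divided by the number of users assigned to that origin before the disaster) is an assumption made for illustration, as are the neighborhood labels and numbers; the study's actual scaling formula is defined in its methods section. Pairs involving a neighborhood without population data are dropped, which mirrors the absence of such neighborhoods in Figure 11.

```python
def scale_up(flows, census_population):
    """Scale user flows to population flows with a per-origin factor.

    flows: {(origin, destination): number of users}.
    census_population: {neighborhood: resident population}.
    """
    # Users assigned to each origin before the disaster.
    origin_users = {}
    for (origin, _), count in flows.items():
        origin_users[origin] = origin_users.get(origin, 0) + count

    scaled = {}
    for (origin, destination), count in flows.items():
        # Drop pairs that involve a neighborhood with no population data.
        if origin not in census_population or destination not in census_population:
            continue
        factor = census_population[origin] / origin_users[origin]
        scaled[(origin, destination)] = count * factor
    return scaled


# Hypothetical user flows and census figures; C has no population data.
user_flows = {("A", "A"): 6, ("A", "B"): 4, ("B", "B"): 5, ("B", "C"): 5}
census = {"A": 1000, "B": 500}
population_flows = scale_up(user_flows, census)
```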
4.2. Validation of Results
The validation process consisted of comparing the inflow, remain, and outflow in each neighborhood with the percentage of damaged infrastructure derived from remote sensing techniques. Figure 12 shows the aggregated inflow, remain, and outflow in each neighborhood.
From Figure 12, it is clear that overall the number of people who remained in each neighborhood was less than the inflow (number of arrivals in the neighborhood) and the outflow (number of people who left the neighborhood after the disaster). However, inflow and outflow vary from one neighborhood to another; i.e., in some cases, the inflow is greater than the outflow (e.g., Ponta Gea, Alto da Manga, etc.), and in other cases, the inflow is less than the outflow (e.g., Vaz, Esturro, Matacuane, etc.). These differences are supported by the fact that some neighborhoods were more highly affected than others (REACH [42]), as shown in Figure 13.
For example, the Ponta Gea and Munhava Central neighborhoods were less affected (only around 29% and 30% of infrastructure was damaged, respectively), and many people probably moved into these neighborhoods. On the other hand, the Chota, Vaz, and Maraza neighborhoods were highly affected (over 85% of infrastructure was damaged), and therefore people might have moved out to other neighborhoods, which explains the high outflow in these areas. In addition, in some neighborhoods where the difference between inflow and outflow is minimal, meaning that the number of people who entered the area was roughly equal to the number who left it, between 40% and 50% of the infrastructure was damaged (e.g., Macurungo and Chipangara). However, there are cases where the inflow was relatively greater than the outflow even though the areas were highly affected (e.g., the Mananga and Matadouro neighborhoods). Similarly, there are cases where the inflow was relatively less than the outflow even though the areas were less affected (e.g., the Esturro neighborhood). The first case might be related to the availability of shelters; i.e., even though the area was highly affected, there were enough shelters for the affected people coming from different areas. The second case might be related to the flood extent in these areas; i.e., the damage assessment data only show the percentage of damaged infrastructure and say nothing about the flooded areas. Therefore, even with less damage, the water level (e.g., in the Esturro neighborhood) might have forced people to leave their houses for other, less affected areas.
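The three quantities compared in this validation follow directly from a displacement matrix: the diagonal gives the remain, row totals off the diagonal give the outflow, and column totals off the diagonal give the inflow. A minimal sketch with hypothetical flows (the neighborhood names are from the text, the numbers are not):

```python
def flow_summary(matrix):
    """Return {neighborhood: {"inflow", "remain", "outflow"}} from OD counts."""
    summary = {}
    for (origin, destination), count in matrix.items():
        for name in (origin, destination):
            summary.setdefault(name, {"inflow": 0, "remain": 0, "outflow": 0})
        if origin == destination:
            summary[origin]["remain"] += count
        else:
            summary[origin]["outflow"] += count
            summary[destination]["inflow"] += count
    return summary


# Hypothetical scaled-up flows between two neighborhoods.
flows = {
    ("Vaz", "Vaz"): 100,
    ("Vaz", "Ponta Gea"): 400,
    ("Ponta Gea", "Ponta Gea"): 900,
    ("Ponta Gea", "Vaz"): 50,
}
summary = flow_summary(flows)
```

In this made-up example, the heavily damaged neighborhood shows a net outflow and the less damaged one a net inflow, which is the pattern the comparison with the damage data looks for.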
Moreover, based on Equation (6) and the values of logarithms of arrivals in each administrative neighborhood from CDRs and the arrivals in each administrative neighborhood from DTM, a standard linear regression model was fitted, and the correlation coefficient between the two variables was determined, as shown in Figure 14.
Figure 14 reveals a promising correlation coefficient (over 70%) between the arrivals in each administrative neighborhood from DTM and the arrivals in each administrative neighborhood from CDRs. Therefore, if one can trust the model derived from the relationship between arrivals in each administrative neighborhood from CDRs and DTM, actionable knowledge can be extracted that can be used to predict the demand for disaster support in areas where no information is available.
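The regression step can be sketched with ordinary least squares and the Pearson correlation coefficient. Equation (6) is not reproduced in this section, so the sketch assumes one reading of the text: the natural logarithm of the CDR-based arrivals is the explanatory variable and the DTM arrivals are the response. The five data pairs are hypothetical and are not the study's values.

```python
import math


def pearson_and_fit(x, y):
    """Return (slope, intercept, r) for the least-squares line y = a*x + b."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


# Hypothetical arrivals for five neighborhoods (not the study's data).
cdr_arrivals = [120, 450, 900, 2300, 5100]
dtm_arrivals = [35, 80, 140, 210, 330]

# Assumed transformation: log of the CDR-based arrivals only.
log_cdr = [math.log(value) for value in cdr_arrivals]
slope, intercept, r = pearson_and_fit(log_cdr, dtm_arrivals)
```

With a fitted slope and intercept, the DTM-style arrivals for a neighborhood without survey data could be predicted from its CDR-based arrivals, which is the use suggested in the text.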
However, some factors might have contributed negatively to the comparison results. The first factor is the DTM sample data, which were limited (data were available for only five administrative neighborhoods). The second factor is the data collection period used to estimate DTM (2 to 13 May 2019), which was a month after the period used to estimate the displacement matrix from CDRs (6 March to 8 April 2019). It is likely that when IOM and INGC conducted the survey, many people had already returned to their origin administrative neighborhoods. Thirdly, the survey was conducted only in the officially identified shelters, but many people might have moved from their origin areas and been hosted by family members who were less affected by the disaster in other administrative neighborhoods. This argument is supported by the fact that when the cyclone hit Beira city, many official shelters were damaged, and the available ones were not enough to host everyone. The fourth factor is the coverage of the CDRs used in this research; i.e., the estimated displacement matrix is relative to a particular mobile network company with its customer market share, under the assumption that the mobile network operator’s users are evenly distributed over the study area, which might not be true in some cases. Finally, CDRs only capture the mobility of people with mobile phones that were used before and after the disaster, whereas the DTM captures the mobility of everyone who moved to shelters, i.e., the mobile phone users from the company that provided the data, mobile phone users from other companies, and non-mobile phone users including children, which might add bias to the comparison.
Even though the correlation is not very high, the CDRs data can still provide an idea about the people displaced by the cyclone. While the survey-based approach (by IOM and INGC) is time-consuming and demands human and financial resources to collect data to estimate the DTM, in some countries, mobile phone network companies provide anonymized CDRs for research institutions for social benefit with very low cost or no cost at all. If provided in a real-time fashion, CDRs can be used to derive the near-real-time displacement matrix, which can be used to support the disaster response teams in their activities.
4.3. Discussion
The study area consisted of 51 cell-towers which were distributed according to the population density, i.e., dense in highly populated areas (city center) and less concentrated in rural areas. Before the disaster, all the cell-towers registered the users’ mobile phone activities regularly. However, when cyclone Idai struck Beira city, some cell-towers failed to register the users’ mobile phone activities, probably due to electricity failure. This happened not only right after the disaster but was extended to 8 April 2019 (the last day of the analysis period).
Moreover, the study area consisted of 27 neighborhoods. Using the areal interpolation method proposed by Flowerdew et al. [
28], it was possible to assign the mobile phone users in each cell-tower to the neighborhoods. This approach assumes that a cell-tower can be shared by different neighborhoods, which was not considered by Bengtsson et al. [
23] and Wilson et al. [
25], who assigned the mobile phone users found to be living in the area covered by a cell-tower to the neighborhood into which the cell-tower’s coordinates fall. As expected, all the neighborhoods in the study area were assigned mobile phone users, even in those cases where the main cell-tower was out of order for almost the entire period after the disaster (Nhangau neighborhood).
However, cell-towers’ coverage areas were estimated using the approach proposed by Okabe et al. [
26], which assumes that there is a homogeneous distribution of mobile phone users within each Voronoi polygon. This is valid in highly populated areas with a dense distribution of cell towers (usually within the city center). However, in areas with lower population density (rural areas), this approach is not realistic: each antenna is designed to cover at most a particular area, and because of the sparsity of cell towers (i.e., an average distance between neighboring antennas of over 7500 m in the rural environment), some areas in the study extent are not supposed to be covered [
45]. In addition to this, the digital elevation model should be taken into consideration when estimating the coverage area of cell-towers [
46].
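The limitation discussed above can be made concrete with a small sketch. A Voronoi polygon is the set of points closer to one tower than to any other, so plain tessellation assigns every location to some tower regardless of distance. Adding a maximum antenna range leaves distant rural locations uncovered. The tower coordinates (in meters, on an assumed projected grid) and the 5000 m range are hypothetical values chosen for illustration.

```python
import math


def nearest_tower(point, towers, max_range_m=None):
    """Return the id of the closest tower, or None if it is out of range.

    Without max_range_m this reproduces Voronoi membership: every point
    belongs to the polygon of its nearest tower.
    """
    tower_id, location = min(
        towers.items(), key=lambda item: math.dist(point, item[1])
    )
    if max_range_m is not None and math.dist(point, location) > max_range_m:
        return None
    return tower_id


# Hypothetical towers 10 km apart, coordinates in meters.
towers = {"T1": (0.0, 0.0), "T2": (10000.0, 0.0)}

near_point = (3000.0, 0.0)
far_point = (5000.0, 9000.0)

voronoi_owner = nearest_tower(far_point, towers)  # always some tower
covered_owner = nearest_tower(far_point, towers, max_range_m=5000.0)
```

A terrain-aware coverage model, as suggested by the reference to the digital elevation model, would replace the simple distance test with a line-of-sight or propagation check.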
The derived displacement matrix shows that a considerable number of people left their origin neighborhoods after the cyclone. This is supported by the infrastructure damage data; i.e., in the most affected areas, there is a high outflow and low inflow, and in the less affected areas, there is a high inflow and low outflow. However, in some cases, even though the area was highly affected (e.g., Mananga and Matadouro), the inflow was higher than the outflow. This is probably due to the availability of shelters; i.e., even though the area was highly affected, there were enough shelters for the affected people coming from different areas. The opposite behavior was observed in the Esturro neighborhood, where, even though the area was less affected, the inflow was relatively less than the outflow. This situation can be explained by the flood extent in this area; i.e., even with less damage, the water level in the area might have forced people to leave their houses for other, less affected areas.
The comparison between arrivals in each neighborhood extracted from CDRs and from DTM shows an encouraging correlation coefficient (over 70%). Some factors can be pointed out as probable reasons for this encouraging but still low correlation coefficient. The first reason could be that there is no overlap between the period of CDR data collection and the survey conducted by IOM in collaboration with INGC that was used to derive DTM. Therefore, it is not possible to trace the DTM-related displacement over time and compare it to the CDR-based results. In addition, in some neighborhoods located on the border of the study area, there is a high probability that the arrivals reported in DTM came from outside the study area, which may add bias to the comparison result since this was not considered in the CDR-based analysis. The shared repository of the survey is available in DTM [
47].
Since all the subscribers found to be living outside the study area before the disaster were excluded from the analysis, the method did not consider the inflow from response teams coming from different parts of Mozambique and other countries. However, by analyzing the relationship between population distribution before the disaster derived from simulation analysis [
48], infrastructure damages [
49], and population behavior, a displacement hot-spot would be identified before a disaster, which is valuable information for rapid disaster response. In addition to this, the method only focused on two periods, before the disaster (from 6 to 30 March), and after the disaster (from 2 to 8 April 2019), and the dynamics right after the disaster were not analyzed.
5. Conclusions and Future Work
In this research, an alternative method for estimating the displacement matrix after a severe disaster is proposed. The method uses mobile phone data, known as CDRs, to estimate the spatial distribution of displaced people in each administrative neighborhood. The estimation starts by creating the Voronoi tessellation of the study area based on the coordinates of the cell-towers. Then, users’ home locations before and after a disaster are estimated. Based on the area of a Voronoi polygon that falls in each administrative neighborhood, mobile phone users found to live in this particular polygon are assigned to the different neighborhoods it overlaps using an areal interpolation method. Then, the users are labeled as displaced or non-displaced and are aggregated based on their neighborhood before and after a disaster to derive the displacement matrix. However, the displacement matrix from CDRs only accounts for the mobile phone users’ movement, not the population flow. Therefore, the estimated flow from CDRs was scaled up using the population data from the census survey. To evaluate the estimated displacement matrix, the derived inflow, remain, and outflow in each neighborhood were compared with the percentage of damaged infrastructure derived from remote sensing techniques. Furthermore, based on the relationship between inflow data (available only for five neighborhoods) from the survey conducted by the International Organization for Migration (IOM) in collaboration with Mozambique’s National Institute of Disaster Management (INGC) and inflow from CDRs, actionable knowledge was extracted, which can be used to predict the demand for disaster support in areas where there is no inflow information available.
However, while this study focused on estimating the displacement during only two periods, before and after a disaster, it would be interesting to test the method for multiple periods after a disaster to capture the dynamics of the mobility of people and hence determine the rate of return to the origin areas. In addition, by analyzing long-period CDRs after a disaster, it would be possible to capture the resettlement areas of the displaced people. Furthermore, it would be interesting to evaluate the dynamics of people immediately after the disaster, instead of considering only two periods as proposed in this paper. Therefore, a method to estimate the spatial distribution of the displaced population immediately after a disaster (when more cell-towers are out of order) needs to be developed. Moreover, while the approach presented in this paper focuses on estimating the number of displaced people after a disaster, it would be interesting to examine the correlation between the result of this study and the displacement hot-spots derived from the relationship between the population distribution before the disaster, estimated using simulation models, and the effects of damage on the population’s behavior.