Transcending Time and Space: Survey Methods, Uncertainty, and Development in Human Migration Prediction

Pu, Tongzheng; Huang, Chongxing; Yang, Jingjing; Huang, Ming

doi:10.3390/su151310584

Open Access Review

by Tongzheng Pu ¹, Chongxing Huang ², Jingjing Yang ¹,* and Ming Huang ¹,*

¹ School of Information Science and Engineering, Yunnan University, Kunming 650500, China

² Faculty of Social and Historical Sciences, University College London, London WC1E 6BT, UK

* Authors to whom correspondence should be addressed.

Sustainability 2023, 15(13), 10584; https://doi.org/10.3390/su151310584

Submission received: 21 May 2023 / Revised: 28 June 2023 / Accepted: 3 July 2023 / Published: 5 July 2023

(This article belongs to the Special Issue Socioeconomic Modelling and Prediction with Machine Learning)

Abstract:

As a fundamental, holistic, and strategic issue facing human society, human migration is a key factor affecting the development of countries and cities, given the constantly changing population numbers. The fuzziness of the spatiotemporal attributes of human migration limits the pool of open-source data for human migration prediction, leading to a relative lag in human migration prediction algorithm research. This study expands the definition of human migration research, reviews the progress of research into human migration prediction, and classifies and compares human migration algorithms based on open-source data. It also explores the critical uncertainty factors restricting the development of human migration prediction. Based on the analysis, there is no “best” migration prediction model, and data are key to forecasting human migration. Social media’s popularity and its increase in data have enabled the application of artificial intelligence in population migration prediction, which has attracted the attention of researchers and government administrators. Future research will aim to incorporate uncertainty into the predictive analysis framework, and explore the characteristics of population migration behaviors and their interactions. The integration of machine-learning and traditional data-driven models will provide a breakthrough for this purpose.

Keywords:

human migration; prediction; methods; artificial intelligence; data; uncertainty

1. Introduction

With rapid economic and social development occurring worldwide, movement between urban and rural areas, between cities, and between countries has become more convenient, and human migration (HM) has become a universal phenomenon. For instance, in China alone, as of November 2021, the inter-provincial mobile population was 124,837,153, and the intra-provincial mobile population was 250,979,606 [1]. A report issued by the United Nations Population Division put the number of global migrants in 2020 at 281 million [2]. While the pace of HM slowed due to the COVID-19 pandemic, the gradual easing of the pandemic and the adjustment of relevant policies have once again led to global increases in internal and international HM [3]. Given the significant decrease in the global population growth rate, and the gradual increase in the ageing population, HM has become a significant component of population growth in many countries and regions. Policy-makers in labor, healthcare, education, and other such areas must fully consider HM development dynamics and trends, to ensure that the policies they formulate are forward-looking, targeted, and effective. To accurately understand HM trends, extensive research has been conducted by governmental organizations, academics, and industries. HM forecasting has, therefore, become a research hotspot in the field of population studies.

International migration has received greater attention than internal migration, due to its multifaceted effects, policy importance, and greater visibility, leading to greater theoretical and empirical results in that field. Early studies of human migration prediction (HMP) mainly focused on the analysis of HM drivers, and the establishment of a relationship equation between the HM drivers and the number of migrants, in order to predict future HM development trends [4]. Studies show that the prediction models used in internal and international HM forecasting are remarkably similar, differing only in data sources and policies; models from both types of HM can be used to support each other in the development of improved forecasting methods [5]. In the 1940s, Zipf proposed a gravitational model to predict HM [6]. In the late 1950s, Bogue proposed the famous “push-pull theory” to describe the motives of HM from a kinematic perspective [7]. Later, Lee systematically summarized the push-pull hypothesis, and further outlined the factors influencing migration behavior [8]. In recent years, scholars have conducted empirical analyses of these models, and have continuously optimized them and improved their accuracy based on applied examples [9,10]. With the development of information technology and statistics, many forecasting methods have emerged, using approaches based on econometrics, time series analyses, and Bayesian statistics, among others. In recent years, with the rapid development of artificial intelligence (AI) technology, with big data and machine learning (ML) as its core, some scholars have attempted to use ML technology for HMP [11,12,13,14,15,16,17,18]; however, this approach is hampered by limited HM data, varying standards, and access difficulties. In addition, the uncertainty of HM drivers, and the difficulty in quantifying them, have led to the slow development of HMP research [4].

A number of systematic reviews of HMP research have been conducted, providing a comprehensive overview of the current status of HMP research, and a reference for further investigation [19,20,21]. Nevertheless, with the rapid development of artificial intelligence, and the publication of numerous related studies, it seems necessary to conduct a comprehensive review of population migration prediction methods. There have been many achievements in various migration fields over the past few years, given that population migration is a complex social phenomenon. By synthesizing studies on international and internal migration forecasting in a common context [22], this paper aims to provide a systematic overview of the academic literature and practice in migration modeling and forecasting. In line with this review, we also discuss traditional and machine-learning methods, as well as the advantages and disadvantages of each in population forecasting. Available public datasets, methods, and models are also provided for researchers who have some basic knowledge of, and interest in, population migration forecasting.

In Section 2, we first review the development of HM theory; this review is not a complete picture of all migration theories, but a basis for further reading. Next, we outline the basic process of population migration prediction and the way artificial intelligence methods have enriched HM prediction, including a list of the main methods, data sources, and spatiotemporal characteristics of migration prediction. In Section 3, we describe the state-of-the-art HMP methods, and expand on the AI methods. The state-of-the-art methods include the deterministic and stochastic methods. Machine-learning methods include the artificial neural network, random forest, support vector machine, recurrent neural network, and graph neural network models. We also provide guidance on the benefits and limitations of these methods. In Section 4, the uncertainties that have slowed the development of HMP are analyzed. Finally, conclusions are drawn by analyzing the current status of HMP research, and pointing out directions for future research.

2. Research: Related Works and Methodology

2.1. Theoretical Review

The geographical or spatial flow of populations between two regions usually involves the change of a permanent residence from the place of departure to the location of arrival. This is called permanent migration [23]. With the rapid development of transportation and information networks, and the acceleration of globalization, the frequency and time spent traveling are increasing, travel distances are growing, and permanent residence is decreasing [24]. Globally, approximately one out of every 30 people is a cross-border migrant; in China, approximately one out of every 12 people is a trans-provincial migrant. In HM projection, it is common practice to ignore internal migration projection studies, and concentrate on international migration, despite the latter comprising only a small proportion of the mobile population. As a result, government organizations do not have a comprehensive and accurate understanding of the entire process of population movement [25]. Despite significant differences in the scale of these two phenomena, studies have shown that the evolution of international migration can be explained to a large extent by internal migration, as there is significant complementarity between the two [26,27,28]. In addition, tourism migration, as another form of population flow, overlaps and has a causal relationship with migration in many respects [29]. It is difficult to draw a clear line between these two phenomena [24]. It is, therefore, necessary to dismiss the generally limited definition of HM, and adopt a systematic approach, in order to properly understand the interconnections in HM, and to ultimately better support policy. This will aid in comprehending the reasons for HM, and accurately predicting HM trends [30].

Compared with other areas of population research, the research on HM theory is relatively unitary, and can be traced back to British statistician Ravenstein’s Laws of Migration [31]. Since then, researchers and specialists in fields including population geography, socioeconomics, and political economy have combined their research to propose several related theories, including the neoclassical economic theory, labor market theory, world system theory, migration network theory, and cumulative causality [32,33]. These theories have helped to lay the groundwork for HMP. In traditional theoretical research, a relatively systematic and mature academic system has been established for HM. Many empirical studies have been carried out, focused on analyzing the relationship between HM and urbanization, as well as the spatial characteristics, policies, causes, and influencing factors of HM. With the fast growth of technologies such as big data and AI, researchers have increasingly been looking into this topic. Based on many empirical studies [34,35], they have developed theories such as the spatiotemporal network of HM, which constantly adds to the theoretical system of HM.

HM is a complex demographic phenomenon. Most existing migration theories only partially describe the migration phenomenon and cannot comprehensively and systematically capture its patterns, so it is difficult to apply them directly in migration forecasting (the push–pull theory being an exception). Researchers therefore commonly translate theories into mathematical models for migration forecasting, and most previous prediction studies have been based on mathematical models rather than on a particular theory [36]. This study thus provides only a brief description of some of the migration theories.

2.2. Problem Statement

Human migration can be measured in two ways: stocks and flows. Migration stocks refer to the number of migrants in a given place at a given time. Flows refer to events that occur within a given period of time (i.e., migratory movements). The migration stocks change over time, owing to the migrant inflow and outflow. Generally, stocks are easier to measure than flows, because migration dynamics are more difficult to capture [37]. Accordingly, HMP tasks can be divided into two main categories: forecasting HM, and predicting future HM development. The basic task of HMP is to forecast the scale, characteristics, and development of HM, based on its influencing factors, current situation, and developmental trend. Another task is predicting future HM development changes via space–time series abstraction, based on spatiotemporal information, population characteristics and events, and other information. Specifically, the HMP process framework includes data acquisition and pre-processing, feature extraction and correlation analysis, model construction, model application, and prediction result outputs, as shown in Figure 1.
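The relationship between stocks and flows described above can be written as a simple accounting identity. The following is a minimal sketch: the function name `project_stock` and all figures are hypothetical, and the identity deliberately ignores other sources of stock change such as deaths or changes of legal status.

```python
from typing import List, Sequence


def project_stock(initial_stock: int,
                  inflows: Sequence[int],
                  outflows: Sequence[int]) -> List[int]:
    """Return the migrant stock at the end of each period.

    Accounting identity: stock_t = stock_{t-1} + inflow_t - outflow_t.
    """
    if len(inflows) != len(outflows):
        raise ValueError("inflows and outflows must cover the same periods")
    stocks = []
    stock = initial_stock
    for inflow, outflow in zip(inflows, outflows):
        stock = stock + inflow - outflow  # flows accumulate into the stock
        stocks.append(stock)
    return stocks


# Hypothetical figures for three periods (illustrative only).
stocks = project_stock(1000, inflows=[120, 90, 150], outflows=[80, 110, 60])
print(stocks)  # [1040, 1020, 1110]
```

The sketch also shows why flows are the harder quantity to measure: the same stock path can result from many different combinations of inflows and outflows.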

In forecasting population movements, the most important task is to analyze data and to explore the driving factors based on different scenarios. Therefore, migration forecasting involves mining potential patterns from historical data, as well as analyzing driving factors. Economic, political, social, cultural, demographic, and environmental factors are often cited as driving forces for migration [38]. However, population migration is a decision made in the context of individual or group needs, opportunities, challenges, constraints, urgency, and uncertainty. Therefore, migration drivers are situational and contextual. That is, complex migration scenario drivers are specific to the time and place when the desire to move is formed, and decisions are made. Often, a complex combination of economic, political, social, and other developments and events, not a single driver, dynamically affects migration opportunities, and migrants’ willingness and ability.

In the early stage, due to limited technical capacity, HMP used a single mathematical method. With the development of information technology and computerized statistics, HMP based on statistical methods has come to the fore, with an emphasis placed on the use of mathematical tools to collect relevant data, and the use of small-sample data to predict population movement issues. However, prediction bias and other problems can occur, due to incomplete information [39]. In recent years, prediction methods based on big data and AI technology have been proposed [40]. Artificial intelligence technology helps to inform immigration policy and management, using meaningful information extracted from social media data to predict individuals’ intentions of migrating, as well as predicting migratory movements in times of conflict or natural disaster [39,41].

2.3. Research Methodology

In this review, a systematic approach was used to collect, evaluate, and classify relevant studies. First, with the research area, methodological scope, and research background in mind, digital libraries and online search engines, including Web of Science, Springer, JSTOR, Science Direct, Wiley Online Library, and Google Scholar, were searched using the custom keywords “migration”, “prediction”, and “method”. As of 2022, 527 results were retrieved for review. In the second step, all the results were screened with customized exclusion keywords (e.g., data migration, software migration, bird or fish migration). Literature tracking was then performed to add potentially relevant studies using the literature wizard linked paper, and a total of 320 papers were finally obtained. We then independently reviewed the resulting papers for consistency with the application scenarios and classification methods of this study, selecting a total of 54 method-specific papers. As shown in Figure 2, we compared the publication trends in the field of population migration prediction, by plotting the number of references to the main prediction methods. The figure shows that a considerable number of new research papers on population displacement forecasting have used the neural network family of techniques since 2019.

In this review, HMP methods are divided into traditional and machine-learning prediction methods. According to the classification of the prediction results, traditional prediction models can be divided into deterministic and stochastic approaches [19,21,35]. Deterministic methods are also called fixed-value forecasting methods or scenario forecasting methods. Because deterministic forecasting methods are mainly applied by developed countries or international organizations (e.g., United Nations, Eurostat), and by few researchers, there are few related research papers, and this paper does not list the specific methods. Machine-learning prediction methods are divided into classical machine-learning methods and deep-learning methods. Table 1 lists the main methods, data sources, and spatiotemporal characteristics of migration prediction. It shows that, in the extant literature, HM data mainly take the form of government statistics, and that many studies focus on the prediction of international HM. Moreover, while most scholars have forecast internal and international HM separately (the dichotomy method), some use a comprehensive prediction method. Due to the limitations of the data, most scholars have chosen traditional forecasting methods (econometric models, for the most part) for HMP, and have primarily focused on long-term changes. In contrast, new AI methods mainly focus on short-term forecasting.

3. Human Migration Forecasting Approaches

Research on migrant prediction is diverse, with a large number of researchers using qualitative research techniques (ethnographic fieldwork, in-depth interviews, etc.). The main focus of this paper is on quantitative methods for forecasting population migration, and only a brief description of the qualitative methods will be given. As mentioned in Section 2, we will introduce methods of predicting population migration, and provide empirical examples based on deterministic, stochastic, and artificial intelligence methods.

3.1. Deterministic Methods

As traditional population forecasting methods, deterministic approaches are based on assumptions about the relevant influencing factors, the independent judgment of experts, or simple extrapolation, and they yield a limited set of predictions (‘high’, ‘medium’, and ‘low’). The deterministic prediction method has several advantages: it is conceptually simple, has low data dependency, and can make full use of experts’ subjective predictions. Moreover, the technique is simple to operate, suitable for medium- and long-term prediction, and widely used in many countries and international institutions, such as the United Nations (UN) and the European Union (EU) [81]. However, deterministic methods often ignore the influence of many uncertain factors. The limited forecast results do not answer the question, “To what extent will the future development of population migration be medium, high, or low?”, or, “What is the probability of high/medium/low population migration in the future?” This limits the interpretability of the prediction results. In addition, forecasting results are largely dependent on the knowledge of experts, and can be influenced by the scope of expert knowledge, political stance, or social attitudes, which can easily make them misleading [65].
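A scenario-based deterministic projection can be sketched in a few lines. The example below is illustrative only: the function name `scenario_projection`, the base-year count, and the three growth rates are assumptions standing in for expert judgment, not values from any of the institutions mentioned above.

```python
from typing import Dict


def scenario_projection(base: float,
                        annual_growth: Dict[str, float],
                        years: int) -> Dict[str, float]:
    """Extrapolate a base-year migrant count under fixed annual growth rates."""
    return {name: base * (1.0 + rate) ** years
            for name, rate in annual_growth.items()}


# Hypothetical expert assumptions for the three variants.
rates = {"low": 0.00, "medium": 0.01, "high": 0.02}
projection = scenario_projection(100_000, rates, years=10)
for name, value in projection.items():
    print(f"{name}: {value:,.0f}")
```

Each variant is a single trajectory with no probability attached, which is exactly the limitation noted in the text: the output cannot say how likely the high, medium, or low variant is.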

3.2. Stochastic Methods

In contrast to deterministic methods, the model parameters of stochastic methods are not fixed, but are treated as random variables. Population migration involves many random elements; therefore, in order to get closer to reality, some unavoidable random factors are built into the population migration model to form a stochastic model. In this regard, many scholars use stochastic models to effectively predict population migration [66,72,73]. Of course, the analysis of stochastic systems is much more difficult than the analysis of deterministic systems. Because of the lack of data, model parameters are difficult to handle when stochastic models are used to describe the migration process, and econometric models based on sample data are now widely used to predict population migration. In addition, the development of linear and nonlinear migration theory has promoted research into the gravitational model, the time series model, the Bayesian model, and other population migration prediction methods, making the stochastic model one of the most effective methods for population migration prediction.

3.2.1. Econometric Forecasting Models

Econometric models are mainly used to study causal relationships between variables of interest, and they are convenient for revealing the relationships between the relative amounts of variable change; thus, they are widely used in HM forecasting studies. Ordinary least squares (OLS), the generalized method of moments (GMM), several versions of the random effects (GLS), seemingly unrelated regressions (SUR), and others are among the most commonly used models for HMP. In 1981, Plaut used econometric models to forecast net population movements in Texas, USA [74]. The net civilian migration ($\mathit{NCM}_{t}$) in the time period $t$ is modeled as:

$$\ln \left[1 + \left(\frac{\mathit{NCM}_{t}}{\mathit{CPOP}_{t-1}}\right)\right] = \lambda \ln \alpha_{0} + \lambda \beta_{1} \ln \mathit{RW}_{t} + \lambda \beta_{2} \ln \mathit{RVU}_{t} + \lambda \alpha_{1} \ln \mathit{USY}_{t} - \lambda \ln \mathit{CPOP}_{t-1} + \mu_{t} \tag{1}$$

where $\mathit{RW}_{t}$ is the real wage rate, $\mathit{RVU}_{t}$ is the vacancy-to-unemployment ratio in Texas relative to the United States, $\mathit{USY}_{t}$ is the real per capita income in the United States, and $\mathit{CPOP}_{t-1}$ is the anticipated net civilian population of Texas.
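Because the dependent variable in Equation (1) is a log transform of the migration rate, a fitted value has to be transformed back to obtain a head count. As a worked step, write $\hat{y}_{t}$ for the fitted right-hand side of Equation (1) with the error term set to zero (a symbol introduced here for illustration only):

```latex
\ln\left[1 + \frac{\widehat{\mathit{NCM}}_{t}}{\mathit{CPOP}_{t-1}}\right] = \hat{y}_{t}
\quad\Longrightarrow\quad
\widehat{\mathit{NCM}}_{t} = \mathit{CPOP}_{t-1}\left(e^{\hat{y}_{t}} - 1\right)
```

For small $\hat{y}_{t}$, $e^{\hat{y}_{t}} - 1 \approx \hat{y}_{t}$, so the fitted value is approximately the net migration rate itself, and a negative fitted value corresponds to net out-migration.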

Later, in response to the problem of forecasting HM in the context of EU enlargement, Fertig and Schmidt provided a simple econometric model to forecast the rate of HM from the Czech Republic, Estonia, Hungary, and Poland (four candidate countries) to Germany [82]. The migration rate ($m_{s,t}$) in the relevant age range for the origin country $s$ and the period $t$ is given by:

$$m_{s,t} = \mu + \epsilon_{s} + \epsilon_{t} + \epsilon_{s,t} \tag{2}$$

where $\mu$ is an overall intercept term, $\epsilon_{s}$ is a random component specific to country $s$ but persistent over time $t$, $\epsilon_{t}$ is a component specific to time periods but relevant for all countries at this point in time $t$, and $\epsilon_{s,t}$ is an unpredictable white noise error term.
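The structure of Equation (2) can be illustrated by splitting a balanced panel of migration rates into an overall mean, country components, period components, and a residual. The following is a minimal sketch using simple averages; it is not the estimator used by Fertig and Schmidt, and the function name `decompose_panel` and the toy numbers are hypothetical.

```python
from typing import List, Tuple


def decompose_panel(rates: List[List[float]]
                    ) -> Tuple[float, List[float], List[float], List[List[float]]]:
    """Split rates[s][t] into mu + country[s] + period[t] + residual[s][t].

    Assumes a balanced panel: every country has the same number of periods.
    """
    n_s, n_t = len(rates), len(rates[0])
    mu = sum(sum(row) for row in rates) / (n_s * n_t)        # overall intercept
    country = [sum(row) / n_t - mu for row in rates]          # persistent country part
    period = [sum(rates[s][t] for s in range(n_s)) / n_s - mu
              for t in range(n_t)]                            # common period part
    residual = [[rates[s][t] - mu - country[s] - period[t]
                 for t in range(n_t)] for s in range(n_s)]    # what is left over
    return mu, country, period, residual


# Toy panel: two origin countries, two periods (illustrative values).
mu, country, period, residual = decompose_panel([[1.0, 2.0], [3.0, 4.0]])
print(mu, country, period)  # 2.5 [-1.0, 1.0] [-0.5, 0.5]
```

Under this reading, a forecast for a future period relies on `mu + country[s]`, because the period and white-noise components of a period not yet observed are unknown and have zero mean.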

Building on this, Dustmann et al. improved the method by adding a relative per capita income variable to predict European migration after the enlargement of the EU in 2004 [83]. For a sending country $s$ in the year $t$, the aggregate migration rate ($m_{s,t}$) is given by:

$$m_{s,t} = \mu_{s} + X_{s,t} \beta_{s} + \delta m_{s,t-1} + \varepsilon_{s,t} \tag{3}$$

where the parameter $\mu_{s}$ captures all unobservable aspects of the process that are specific to country $s$ but constant over time, $X_{s,t}$ contains the observable time-varying characteristics of country $s$ at time $t$, $\beta_{s}$ are vectors of unknown parameters to be estimated, and $\varepsilon_{s,t}$ is the error term reflecting all unsystematic influences on the process. The method applied in that paper does not fully take into account the non-stationary nature of HM, but treats it as a stationary process. As a result, the model showed a large error in post hoc tests under the influence of relevant policies. Nevertheless, the method was found to be valid for forecasting in Germany, where temporary restrictions on access to the labor market were imposed.
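The lagged term in Equation (3) means that forecasts are produced recursively, each period feeding into the next. The sketch below iterates the equation forward with the error term set to zero. The function name `iterate_forecast` and all parameter values are hypothetical, and the product $X_{s,t}\beta_{s}$ is passed in as a single precomputed number per period.

```python
from typing import List, Sequence


def iterate_forecast(m_last: float,
                     mu_s: float,
                     x_beta_path: Sequence[float],
                     delta: float) -> List[float]:
    """Iterate m_t = mu_s + x_beta_t + delta * m_{t-1}, error term set to zero."""
    path = []
    m_prev = m_last
    for x_beta in x_beta_path:
        m_prev = mu_s + x_beta + delta * m_prev
        path.append(m_prev)
    return path


# Hypothetical parameters with constant covariates over 30 periods.
path = iterate_forecast(m_last=2.0, mu_s=0.5, x_beta_path=[0.1] * 30, delta=0.5)
long_run = (0.5 + 0.1) / (1.0 - 0.5)  # fixed point of the recursion when |delta| < 1
print([round(v, 3) for v in path[:3]])  # [1.6, 1.4, 1.3]
```

With constant covariates and $|\delta| < 1$, the path converges to the fixed point `long_run`, which is the stationary behavior the model assumes. A policy change that shifts the process would not be captured by this recursion, consistent with the forecast errors noted above.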

A similar study, carried out by Alvarez-Plata et al., used actual income levels, employment rates, population size, and geographic and cultural similarity dummy variables to build a predictive model for HM from central and eastern European (CEE10) to European Union (EU15) countries [84]. Specifically, they model the share of migrants from country $h$ residing in country $f$, expressed as $\mathit{mst}_{fht}$:

$$\mathit{mst}_{fht} = \alpha + (1 - \delta)\, \mathit{mst}_{fh,t-1} + \beta_{1} \ln \left(\frac{w_{ft}}{w_{ht}}\right) + \beta_{2} \ln (w_{ht}) + \beta_{3} \ln (e_{ft}) + \beta_{4} \ln (e_{ht}) + \beta_{5} \ln (P_{ft}) + Z_{fh} \gamma + u_{fht} \tag{4}$$

where $w$ is the wage, $e$ is the employment rate, and $P_{h}$ is the population in the home country. The error term is $u_{fht} = \mu_{fh} + v_{fht}$, where $\mu_{fh}$ denotes a country-specific effect, and $v_{fht}$ is white noise.
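Equation (4) has a partial-adjustment form, so it implies a long-run migrant share toward which the stock moves. As a worked step, let $x_{fh}$ collect every right-hand-side term of Equation (4) other than the lagged share and the error term, held constant over time (a shorthand introduced here for illustration), and assume $0 < \delta \le 1$. Setting $\mathit{mst}_{fht} = \mathit{mst}_{fh,t-1} = \mathit{mst}^{*}_{fh}$ gives:

```latex
\mathit{mst}^{*}_{fh} = (1 - \delta)\,\mathit{mst}^{*}_{fh} + x_{fh}
\quad\Longrightarrow\quad
\mathit{mst}^{*}_{fh} = \frac{x_{fh}}{\delta}
```

The parameter $\delta$ therefore acts as a speed of adjustment: each period closes a fraction $\delta$ of the gap between the current share and its long-run level, and a smaller $\delta$ implies a larger long-run response to a given change in wages or employment.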

The determinants of HM can be selected and quantified. Drawing on HM theory, Cappelen et al. built an econometric model with variables such as the income level, unemployment rate, and population size of Norway and of the migrants’ countries of origin, as well as the number of immigrants already living in Norway, to produce an overall prediction of immigration to Norway [66]. The model is given by:

$$\log \left(\frac{M_{j}}{\mathit{POP}_{j}}\right) = \beta_{0}^{*} + \beta_{1}^{*} \log \left(\frac{Y_{NOR}}{Y_{j}}\right) + \beta_{2}^{*} \log \left(\frac{\mathit{IS}_{j}}{\mathit{POP}_{NOR}}\right) + \beta_{3}^{*} U_{NOR} + \beta_{4}^{*} U_{j} + \beta_{5}^{*} D_{j} \tag{5}$$

where the left-hand-side variable is the log of the migration rate of group $j$, $Y_{NOR} / Y_{j}$ is the relative per capita income, $\mathit{IS}_{j} / \mathit{POP}_{NOR}$ is a proxy for the migrant network, $U_{NOR}$ is the unemployment rate in Norway, $U_{j}$ is the unemployment rate in the sending areas, and $D_{j}$ is a vector of intervention dummy variables that captures political events relating to changes in either country that are relevant to people who try to enter Norway.
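The coefficients in Equation (5) can be read directly as elasticities or semi-elasticities, depending on whether the regressor enters in logs or in levels. As a worked step that follows from the functional form alone:

```latex
\frac{\partial \log (M_{j} / \mathit{POP}_{j})}{\partial \log (Y_{NOR} / Y_{j})} = \beta_{1}^{*},
\qquad
\frac{\partial \log (M_{j} / \mathit{POP}_{j})}{\partial U_{NOR}} = \beta_{3}^{*}
```

A 1% rise in relative income thus changes the migration rate by roughly $\beta_{1}^{*}$ percent, while a one-unit rise in the Norwegian unemployment rate changes it by roughly $100\,\beta_{3}^{*}$ percent when $\beta_{3}^{*}$ is small. The signs and sizes of these effects depend on the estimates, which are not reproduced here.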

In addition, the change in income under the influence of exogenous variables was fully considered, leading to corresponding predictions. Econometric models have also been successfully applied to the prediction of various types of HM (such as skilled, college graduate, and labor force migration) [85].

While there have been many achievements, the econometric model has certain shortcomings when applied to HMP. In particular, omitted variables are often a problem during variable selection. In addition, to reduce the research complexity, researchers tend to ignore some essential population characteristics, such as the population size and age structure. Moreover, variable selection error is often a major source of prediction error.

Given the shortcomings of the econometric model, Dao et al. selected and parameterized appropriate driving factors for international HM, built a socio-economic structural equation model, estimated its parameters using historical data, and then predicted the two-way trend of international HM [85]. At time $t$, the utility of a type-$s$ individual born in country $i$ and living in country $j$ is given by:

$$u_{ij,s,t} = \tilde{\gamma} \ln w_{j,s,t} + \ln v_{ij,s,t} + \xi_{ij,s,t} \tag{6}$$

where $w_{j,s,t}$ is the wage rate attainable in the destination country $j$; $\tilde{\gamma}$ is a parameter governing the marginal utility of income; $v_{ij,s,t}$ stands for the nonwage income and amenities in country $j$ (public goods, non-monetary amenities, and transfers minus taxes), and is net of the legal and private costs of moving from $i$ to $j$; $\xi_{ij,s,t}$ is the random taste component, capturing heterogeneity in the preferences for alternative locations, in mobility costs, in assimilation costs, etc. The prediction results were found to be consistent with the actual situation, which supported the correctness and feasibility of the model.
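The utility specification in Equation (6) turns into location-choice probabilities once a distribution is assumed for the random taste component. The sketch below assumes that the taste shocks are independent and follow a type I extreme value distribution, which yields the standard multinomial logit form; this distributional assumption, the function name `location_probabilities`, and the toy numbers are introduced here for illustration and are not stated in the text above.

```python
import math
from typing import List, Sequence


def location_probabilities(wages: Sequence[float],
                           amenities: Sequence[float],
                           gamma: float) -> List[float]:
    """Multinomial logit choice probabilities over destinations.

    Deterministic utility: V_j = gamma * ln(w_j) + ln(v_ij).
    Assumes i.i.d. type I extreme value taste shocks (an assumption).
    """
    utilities = [gamma * math.log(w) + math.log(v)
                 for w, v in zip(wages, amenities)]
    max_u = max(utilities)  # subtract the maximum for numerical stability
    weights = [math.exp(u - max_u) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


# Toy example: higher wages are exactly offset by lower net amenities.
probs = location_probabilities(wages=[1.0, 2.0, 4.0],
                               amenities=[1.0, 0.5, 0.25],
                               gamma=1.0)
# A larger marginal utility of income tilts choices toward high-wage destinations.
probs_high_gamma = location_probabilities(wages=[1.0, 2.0, 4.0],
                                          amenities=[1.0, 0.5, 0.25],
                                          gamma=2.0)
```

With `gamma=1.0` the three destinations are equally attractive in this toy example, and raising `gamma` shifts probability mass toward the high-wage destination, which is the role the text assigns to the marginal utility of income.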

Burzynski et al. optimized Dao et al.’s model to propose a comparable institutional equation model [86]. The model expanded the scope of international migration drivers by introducing new factors, including internal migration, technological change, and education. The quantitative analysis of the factors influencing the global distribution of highly skilled people (i.e., internal educational opportunities, the sectorial distribution of workers, and international migration) demonstrated that the uneven distribution of labor is a significant contributor to global inequality, that HM is one of the most powerful ways to alter the global distribution of highly skilled people, and that economic inequality within regions can affect the global distribution of highly skilled people. According to the researchers, the model is equally appropriate for predicting international population flow, and the working-age population.

Compared with other methods, the parametric model can be used to analyze the overall effects of individual indicators, and the relationships between them, and has a strong adaptability to complex systems.

3.2.2. Gravity Model

The gravity model is named for its form, which resembles Newton’s law of gravity, and it can effectively explain spatial interaction. The model is suitable for the analysis of regional flow, and has been widely used in the research and prediction of HM. In the 1940s, Zipf proposed a gravity model for HMP which holds that the HM between two regions is directly proportional to the product of the populations of the two areas, and inversely proportional to the distance between them, as given by the following equation:

$$M_{ij} \sim \alpha \frac{P_{i} P_{j}}{d_{ij}^{\beta}} \tag{7}$$

where $M_{ij}$ is the number of people migrating from region $i$ to region $j$, $P_{i}$ is the number of people in region $i$, $P_{j}$ is the number of people in region $j$, $d_{ij}$ is the distance between the two areas, and $\alpha$ and $\beta$ are two undetermined parameters.
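The two undetermined parameters in Equation (7) can be estimated by taking logarithms, which turns the model into a straight line: $\ln\big(M_{ij} / (P_{i} P_{j})\big) = \ln \alpha - \beta \ln d_{ij}$. The following is a minimal sketch using ordinary least squares on synthetic, noise-free data; the function name `fit_gravity` and all numbers are hypothetical, and real applications would need to handle zero flows, for which the logarithm is undefined.

```python
import math
from typing import Sequence, Tuple


def fit_gravity(flows: Sequence[float],
                pop_origin: Sequence[float],
                pop_dest: Sequence[float],
                distance: Sequence[float]) -> Tuple[float, float]:
    """Estimate (alpha, beta) in M_ij = alpha * P_i * P_j / d_ij**beta by OLS on logs."""
    y = [math.log(m / (p_i * p_j))
         for m, p_i, p_j in zip(flows, pop_origin, pop_dest)]
    x = [math.log(d) for d in distance]
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx                      # equals -beta
    intercept = y_mean - slope * x_mean    # equals ln(alpha)
    return math.exp(intercept), -slope


# Synthetic flows generated from known parameters (illustrative only).
p_i = [1e6, 2e6, 5e5, 3e6, 8e5]
p_j = [2e6, 1e6, 4e6, 5e5, 1.5e6]
d = [100.0, 250.0, 400.0, 800.0, 1200.0]
true_alpha, true_beta = 1e-8, 1.5
m = [true_alpha * a * b / dist ** true_beta for a, b, dist in zip(p_i, p_j, d)]

alpha_hat, beta_hat = fit_gravity(m, p_i, p_j, d)
```

On noise-free data the fit recovers the generating parameters up to floating-point error. With real data the same log-linear form is the starting point for the regression variants discussed in this section.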

Building on the classical gravity model, researchers have primarily concentrated on models based on actual HM data, covering model optimization, empirical research, the explanation of global migration patterns, and the impact of economy, society, politics, culture, and climate on immigration, among others [87].

Beine et al. optimized the model with actual HM data [88]. The expected number of migrants between two countries is expressed as a function of the source country’s ability to send migrants, the cost of migration between the countries, and the relative attractiveness of the destination country. In gravity models, attractiveness generally refers to the economic appeal of a particular destination, as compared to that of other countries. Because data on expected income are difficult to come by, economists usually use GDP levels or related indices instead [89,90]. When considering the migration cost, various possible factors are taken into account, such as the monetary cost, psychological cost, new-language-learning cost, etc. [91]. However, in the classical gravity model, the distance parameter incorporates all of these factors; i.e., an increase in the distance between two countries leads to a rise in the cost of migration. In this regard, the fixed effects of similar countries can be regarded as dummy variables.

In addition, the gravity model can also be transformed into a multiple regression equation [88]. In this form, the migration flow depends linearly on each explanatory variable, and the variables are assumed to be independent of one another. While the principle of the model is straightforward, a critical problem is ignored; i.e., individual-level and unobserved characteristics in the process of HM are not considered. In this regard, Backhaus et al. studied the impact of climate change on bilateral migration by adding the average temperature and precipitation of the country of origin to the classical gravity model [92]. Migration inflows from country i to country j in year t were modeled as $M_{ijt}$:

$$\begin{aligned} \ln M_{ijt} ={}& \alpha_{0} + \alpha_{1}\,\mathit{wtemp}_{it} + \alpha_{2}\,\mathit{wpre}_{it} + \alpha_{3}\,\mathit{GDP}_{it} + \alpha_{4}\,\mathit{GDP}_{jt} + \alpha_{5}\,\mathit{DemPres}_{it} + \alpha_{6} \ln \mathit{Population}_{it} + \alpha_{7} U_{jt} \\ &+ \alpha_{8}\,\mathit{Trade}_{it} + \gamma_{t} + \omega_{ij} + \varepsilon_{ijt} \end{aligned} \tag{8}$$

where $\mathit{wtemp}_{it}$ is the population-weighted average annual temperature in degrees Celsius, $\mathit{wpre}_{it}$ is the average annual precipitation in millimeters, $\mathit{GDP}_{it}$ ($\mathit{GDP}_{jt}$) is the PPP-adjusted GDP per capita, divided by a factor of 1000, in the origin (destination) country in year t, $\mathit{DemPres}_{it}$ is the share of young people in the country of origin’s working-age population, $U_{jt}$ is the unemployment rate in the country of destination at time t, $\mathit{Trade}_{it}$ is the openness ratio in the country of origin at time t, $\omega_{ij}$ are the country-pair characteristics, $\gamma_{t}$ is a set of year dummies that captures global shocks, and $\varepsilon_{ijt}$ is the error term.
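As a rough illustration of how a log-linear gravity specification such as Equation (8) can be estimated, the sketch below fits a stripped-down version by ordinary least squares. It is not the estimator used by Backhaus et al.: the fixed effects $\gamma_{t}$ and $\omega_{ij}$ are omitted, only two regressors are kept, the data are synthetic, and the helper names (`solve_linear`, `ols`) and all numbers are hypothetical.

```python
import random

def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x

def ols(rows, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear(xtx, xty)

# Synthetic country-pair observations (assumed values, for illustration only).
rng = random.Random(0)
true_alpha = [2.0, 0.05, -0.03]  # intercept, wtemp_it, GDP_it
rows, log_flows = [], []
for _ in range(200):
    wtemp = rng.uniform(5.0, 30.0)   # origin temperature, degrees Celsius
    gdp = rng.uniform(1.0, 40.0)     # origin GDP per capita / 1000
    rows.append((wtemp, gdp))
    log_flows.append(true_alpha[0] + true_alpha[1] * wtemp
                     + true_alpha[2] * gdp + rng.gauss(0.0, 0.1))

alpha_hat = ols(rows, log_flows)
print([round(a, 3) for a in alpha_hat])
```

In applied work the year and country-pair effects would be added as dummy columns of the design matrix, which is why the number of estimated parameters grows quickly with the size of the panel.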

In addition, Friebel et al. extended the gravity model with changes in smuggling routes, and studied how immigration costs affect the willingness to migrate to a specific location [93]. For individual i moving from country of origin o to country of destination d at time t, the binary indicator of migration intention ($M_{iodt}$) is:

$$M_{iodt} = \beta_{0} + \beta_{1} X_{iot} + \beta_{2} D_{odt} + v_{ot} + w_{dt} + u_{od} + e_{iodt} \tag{9}$$

where $D_{odt}$ is the time-varying (log of) distance between the origin and destination countries, measured along migration routes; $X_{iot}$ is a vector of individual covariates, which include age, gender, education, household size, wealth, urban residence, and satisfaction with local amenities; $u_{od}$ are country-pair fixed effects, included together with full sets of origin-by-year ($v_{ot}$) and destination-by-year ($w_{dt}$) fixed effects; and $e_{iodt}$ is the error term.

Campos proposed an extended gravity equation that allows projections to be made for all pairs of countries in the world using a small number of explanatory variables. The results showed that the number of migrants is projected to increase from 2.8% of the world population in 2010 to around 3.5% in 2050. Because the method relies on assumed migration–population relationships, and does not consider future policy changes or unexpected events, its predictions may be biased [53].

Nicolaie et al. proposed a simplified gravity model to study emigration from Romania. Because of limited access to, or the incompleteness of, statistical data, Romanian migration flows were analyzed only for certain EU countries, and at the EU-27 level, for the period 1995–2014. The analysis showed that for every 1% increase in unemployment, the number of emigrants increased by 2.57 [54]. The flow of migrants from region/country i to region/country j in time t is given by:

$$\ln M_{ijt} = \beta_{0} \ln M_{ijt} + \beta_{1} \ln A_{it} + \gamma_{1} A_{jt} + \beta_{2} \ln Y_{it} + \gamma_{2} Y_{jt} + \beta_{3} \ln C_{ijt} + g_{t} + f_{ij} + \varepsilon_{ijt} \tag{10}$$

where $Y_{it}$ / $Y_{jt}$ is the income in location i/j at time t, $A_{it}$ / $A_{jt}$ are the amenities in i/j at time t, $C_{ijt}$ are the costs of moving from location i to j, $g_{t}$ is a time effect variable, $f_{ij}$ are specific effects, and $\varepsilon_{ijt}$ is the error term.

The nonlinear gravity model by Rikani and Schewe (2021), used to project future global migration trajectories, matches spatio-temporally pooled observed flows well. Although this model does a good job of reproducing past patterns and trends using fewer parameters, a gap remains between the predictions and real values at the country level [94].

In summary, because it rests on an interpretation of attraction and distance, the gravity model is applicable to the study of HM-related issues, and is highly robust. The flexible selection of parameters, such as the environment, politics, sociology, micro/macro-economy, and geography, among others, enables a better understanding of the driving factors in cross-border migration flows. Although the gravity model provides a reasonable explanation for the spatial pattern of HM, Beyer et al. found that it does not perform well in the time dimension [95]. The existing methods all rely on historical data, and assume that the relevant variables remain stable over the long term. Unpredictable impacts, such as those of financial crises, war, climate change, or technological progress, are not fully considered, which makes the results less convincing.

Because the gravity model contains parameters that must be estimated, it requires a large amount of historical data, and the calculation process is complex. To address this, Simini et al. proposed the radiation model [9]. In this model, the number of individuals choosing a destination is proportional to the populations of the source and the destination, and decreases with distance. However, the method has limitations, because it focuses only on the flow between two specific points.
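To make the radiation idea concrete, the sketch below uses the parameter-free form in which the radiation model is commonly stated: the flow from i to j is $T_{ij} = T_i \, m_i n_j / [(m_i + s_{ij})(m_i + n_j + s_{ij})]$, where $m_i$ and $n_j$ are the source and destination populations and $s_{ij}$ is the population living within the circle of radius $r_{ij}$ around i, excluding i and j. Distance therefore enters only through $s_{ij}$. The function name, the coordinates, and the numbers are hypothetical, and the outflows $T_i$ are assumed to be given.

```python
import math

def radiation_flows(populations, coords, outflows):
    """Return a matrix of flows T_ij under the parameter-free radiation form."""
    n = len(populations)
    flows = [[0.0] * n for _ in range(n)]
    for i in range(n):
        m_i = populations[i]
        dist = [math.dist(coords[i], coords[k]) for k in range(n)]
        for j in range(n):
            if i == j:
                continue
            n_j = populations[j]
            # Population closer to i than j is, excluding i and j themselves.
            s_ij = sum(populations[k] for k in range(n)
                       if k not in (i, j) and dist[k] <= dist[j])
            flows[i][j] = (outflows[i] * m_i * n_j
                           / ((m_i + s_ij) * (m_i + n_j + s_ij)))
    return flows

# Three hypothetical locations on a line.
populations = [100.0, 50.0, 200.0]
coords = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]
outflows = [10.0, 5.0, 20.0]   # assumed total emigrants per location
flows = radiation_flows(populations, coords, outflows)
print([round(v, 3) for v in flows[0]])
```

No coefficients are fitted here, which is the practical contrast with the gravity model: the only inputs are populations, locations, and total outflows.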

3.2.3. Time-Series Models

In the traditional HMP method, time-series analysis and extrapolation is another important method for migration forecasting. The classical models of time-series prediction mainly include the autoregressive (AR) model, the moving average (MA) model, and the autoregressive integrated moving average (ARIMA) model [61].

Generally, for a sufficiently long series, the different migration processes ($m_{t}$) can each be expressed by an unconstrained first-order autoregressive model, AR(1), with a constant $c$; the process is stationary whenever the autoregression parameter satisfies $\varphi \in (-1, 1)$:

$$\ln(m_{t}) = c + \varphi \ln(m_{t-1}) + \varepsilon_{t} \tag{11}$$

In addition, an ARMA(1, 1) model is also used, which adds a moving-average element, with an additional parameter $\theta$, to the AR(1) model above:

$$\ln(m_{t}) = c + \varphi \ln(m_{t-1}) + \varepsilon_{t} + \theta \varepsilon_{t-1} \tag{12}$$
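A minimal sketch of how Equation (11) can be used in practice is given below: $c$ and $\varphi$ are estimated by least squares on the log series, and forecasts are produced by iterating the fitted recursion. The function names and the simulated series are hypothetical; the moving-average term of Equation (12) is not estimated here, since that normally requires an iterative likelihood-based routine.

```python
import math
import random

def fit_ar1(series):
    """Least-squares estimates (c, phi) for ln m_t = c + phi * ln m_{t-1} + eps."""
    logs = [math.log(v) for v in series]
    x, y = logs[:-1], logs[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    phi = sxy / sxx
    c = mean_y - phi * mean_x
    return c, phi

def forecast_ar1(last_value, c, phi, steps):
    """Iterate the fitted recursion and return forecasts on the original scale."""
    log_m = math.log(last_value)
    path = []
    for _ in range(steps):
        log_m = c + phi * log_m
        path.append(math.exp(log_m))
    return path

# Simulated annual migration counts (assumed parameters, illustration only).
rng = random.Random(1)
log_m, series = 3.0, []
for _ in range(300):
    log_m = 1.0 + 0.7 * log_m + rng.gauss(0.0, 0.05)
    series.append(math.exp(log_m))

c_hat, phi_hat = fit_ar1(series)
print(round(c_hat, 2), round(phi_hat, 2))
print([round(v, 1) for v in forecast_ar1(series[-1], c_hat, phi_hat, 5)])
```

When $|\varphi| < 1$, the iterated forecast converges to $\exp(c / (1 - \varphi))$, which illustrates why such models carry little information beyond the near term.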

Frees used AR(1) modeling to forecast state-to-state migration rates, based on the then recently revised and updated US Bureau of the Census database. The results suggest that recent rates are important in predicting internal migration in the near term [56]. Later, Beer used an ARIMA model to specify statistical forecast intervals for the Netherlands. The results showed that official forecasts had underestimated the uncertainty of immigration in the past [57].

To forecast international migration to and from the UK, Bijak et al. used time-series models with and without expert opinion, including ARIMA models, autoregressive distributed lag models, and past-error propagation. It was found that the low order ARIMA models perform better with stationary data [61].

Shimizu and Shin developed a SARIMA model for predicting entry into, and exit from, Tokyo. To improve the accuracy, factors such as the COVID-19 crisis were added. The results indicated that the model is adequate for such short-to-medium-term time-series data [63]. Similarly, Fantazzini et al. employed three classes of model for out-of-sample forecasting of interregional migration in Russia; these include short-term forecasting using ARIMA and Google-augmented ARIMA models, as well as multivariate models for long-term forecasting. The empirical analysis found that including Google Trends data in a model enhances the prediction of migration flows [64].

The time-series prediction model identifies the characteristics, trends, and development patterns of HM from the time series itself, and uses them to predict future changes in HM. However, because the time-series forecasting method does not consider external factors, its prediction error has a structural weakness: when HM policy changes significantly, the forecasts tend to deviate substantially from the actual situation [20]. Therefore, time-series methods perform better for short-term prediction than for long-term forecasts.

3.2.4. Bayesian Prediction Model

Bayesian models are considered to be an extension of univariate time-series models in which the inputs are treated probabilistically. In HMP, the historical number of population movements is the only influence; therefore, the method is also considered a purely data-driven approach. Research results have shown that the Bayesian model is more flexible and practical when migration data are deficient [96]. Because HMP data are incomplete, Bayesian models combine historical trends, expert judgments, and various models in a probabilistic way.

To forecast immigration for seven European countries, Bijak presented a Bayesian model, based on quantitative data and qualitative knowledge elicited from country-specific migration experts in a two-round Delphi survey. In the combined Bayesian model, expert judgment can be used as a prior distribution for the different parameters [46]. The parameters are then updated according to the data.
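The mechanism of using expert judgment as a prior and then updating it with data can be illustrated with the simplest conjugate case. The sketch below is not Bijak’s model: it assumes a single unknown mean migration level, a normal prior elicited from experts, and normally distributed observations with known variance. The function name and all numbers are hypothetical.

```python
def update_normal_mean(prior_mean, prior_var, data, obs_var):
    """Posterior (mean, variance) of a normal mean with known observation variance."""
    prior_precision = 1.0 / prior_var
    data_precision = len(data) / obs_var
    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * (prior_precision * prior_mean + sum(data) / obs_var)
    return post_mean, post_var

# Experts expect about 50 (thousand) immigrants per year, with wide uncertainty.
expert_mean, expert_var = 50.0, 15.0 ** 2
observed = [62.0, 58.0, 65.0, 60.0]        # assumed annual observations
post_mean, post_var = update_normal_mean(expert_mean, expert_var, observed, 10.0 ** 2)
print(round(post_mean, 2), round(post_var ** 0.5, 2))
```

The posterior mean is a precision-weighted average of the expert prior and the sample mean, so the expert view dominates when data are scarce and fades as observations accumulate.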

Responding to the potential impact of the environment on migration, Abel investigated how to use Bayesian modeling to predict uncertainty about the level of immigration to the UK resulting from environmental factors elsewhere [47]. The study considered a set of autoregressive (AR) time-series models based on the k-year history of immigration, AR(k), defined as follows:

$$\begin{cases} m_{t} = \mu + \sum_{i=1}^{k} \left[\varphi_{i} \cdot (m_{t-i} - \mu)\right] + \varepsilon_{t}, & t < 2010 \\ m_{t} = v + \sum_{i=1}^{k} \left[\varphi_{i} \cdot (m_{t-i} - v)\right] + \varepsilon_{t}, & t \geq 2010 \end{cases} \tag{13}$$

where $m_{t}$ is the transformed immigration in year t; $\mu$ and $v$ are the mean levels of $m_{t}$ in the observed data series and the forecasted future data series, respectively; $\varphi_{i}$ are the autoregression coefficients relating $m_{t}$ to its past history up to k periods (years) back; and $\varepsilon_{t} \sim N(0, \sigma_{c}^{2})$ is the error term.

Later, Wiśniowski et al. studied the possible effects of Scottish independence on internal and international migration [48]. Their predictions were obtained from a Bayesian forecasting model, and take into account different sources of uncertainty in future migration flows.

Azose and Raftery used a Bayesian hierarchical first-order autoregressive, or AR(1), model to produce fitted forecasts of global HM rates [49]. In this model, the uncertainty of international migration is quantified from the posterior distribution, using demographic variables as inputs. The model enables the long-term forecasting of international migration without the projections becoming explosive. The authors modelled the migration rate $r_{c,t}$ in country c and time period t as follows:

$$(r_{c,t} - \mu_{c}) = \phi_{c} (r_{c,t-1} - \mu_{c}) + \varepsilon_{c,t} \tag{14}$$

where $\varepsilon_{c,t} \sim N(0, \sigma_{c}^{2})$ is a normally distributed random error with a mean of zero and a variance of $\sigma_{c}^{2}$, $\mu_{c}$ is the long-term average migration rate, and $\phi_{c}$ is the autoregressive parameter, which is assigned a uniform prior.
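To show what a posterior for the autoregressive parameter in Equation (14) looks like, the sketch below evaluates it on a grid for a single country. This is a deliberate simplification rather than the hierarchical model of Azose and Raftery: $\mu_{c}$ and $\sigma_{c}$ are treated as known, the uniform prior is assumed to be on (0, 1), and no Markov chain Monte Carlo sampling is involved. The function names and the data are hypothetical.

```python
import math

def phi_posterior(rates, mu, sigma, grid_size=200):
    """Grid posterior of phi under a Uniform(0, 1) prior, with mu and sigma fixed."""
    devs = [r - mu for r in rates]
    grid = [(k + 0.5) / grid_size for k in range(grid_size)]
    log_post = []
    for phi in grid:
        sse = sum((devs[t] - phi * devs[t - 1]) ** 2 for t in range(1, len(devs)))
        log_post.append(-0.5 * sse / sigma ** 2)  # flat prior adds a constant
    peak = max(log_post)
    weights = [math.exp(lp - peak) for lp in log_post]
    total = sum(weights)
    return grid, [w / total for w in weights]

def posterior_mean(grid, probs):
    """Posterior expectation of phi on the grid."""
    return sum(g * p for g, p in zip(grid, probs))

# Hypothetical net migration rates decaying towards a long-term mean of 1.0.
rates = [1.0 + 0.6 ** t for t in range(30)]
grid, probs = phi_posterior(rates, mu=1.0, sigma=0.05)
print(round(posterior_mean(grid, probs), 3))
```

Because the whole distribution of $\phi_{c}$ is available, forecast intervals can be built by propagating draws of $\phi_{c}$ through Equation (14) instead of plugging in a single estimate.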

Wiśniowski et al. developed a Bayesian approach to forecast immigration (counts) and emigration (rates) by age and sex for the UK, which can be adapted to different data types and information sources [97].

In the short term, the applicability of the model to other country situations and types of data has been tested by Raymer and Wiśniowski. Time-series data for Sweden, South Korea, and Australia were also used to validate the predictive accuracy and generalizability of the model [50].

The advantage of Bayesian models is that the model parameters are assessed probabilistically rather than as point values; i.e., a Bayesian analysis yields a complete distribution for each parameter, not just a single value. In a Bayesian model, the parameters are treated as random variables drawn from a specified distribution, and the type of that distribution is an additional input alongside the data. Using this distribution, it is possible to simulate data following a stochastic process and, through the data-generating process, to derive plausible values for the parameters.

3.2.5. Expert Prediction Model

Expert forecasting allows for timely adjustments to policy releases and expected changes, contains more information than model forecasting, and makes it easier to recreate the forecasting process; it has therefore become an important method for population forecasting. A single forecasting model often contains only part of the information about the forecast object, and combining single models according to certain rules can improve the forecast accuracy by including more comprehensive information. Accordingly, in population migration prediction, expert prediction models rarely appear alone; they are usually combined with other prediction models, which brings together the advantages of both and improves the prediction accuracy.

In the context of traditional temporal probabilistic prediction, Lutz and Goldstein proposed an expert-based probabilistic population prediction method [98]. The number of population movements can be obtained by the average trajectory of the population movement process, an a priori assumption derived from the subjective judgment of experts, and a selected stochastic process. However, for exceptional cases (e.g., war, disaster, etc.), expert experience may lose its usefulness, and lead to invalid or opposite prediction results. In response, an expert-based algorithm for forecasting population composition (including net migration) under Bayesian models was proposed by Billari et al. [99]. This method was then further extended by the researchers, to population in-migration and out-migration forecasting [100]. However, purely expert methods are limited by the use of too little data, and rely entirely on the subjective judgments of experts. The problem of bimodality may arise when experts make errors in their decisions, or when there are differences of opinion among expert groups.

To cope with the lack of large amounts of temporal HM data, some scholars use the grey model, which takes partially known HM information as the research object [12,13]. A grey model is established by extracting useful information from the known data, in order to describe and capture HM development trends accurately.
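The text does not say which grey model the cited studies use. As an assumption, the sketch below implements the widely used GM(1,1) variant: the series is accumulated, a first-order grey equation $x^{(0)}(k) + a z^{(1)}(k) = b$ is fitted by least squares, and predictions are recovered by differencing the fitted accumulated series. The function names and the short migration series are hypothetical.

```python
import math

def gm11_fit(x0):
    """Estimate the GM(1,1) development coefficient a and grey input b."""
    x1, total = [], 0.0
    for value in x0:                      # accumulated generating operation
        total += value
        x1.append(total)
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, len(x1))]
    y = x0[1:]
    w = [-v for v in z]                   # model: y = a * (-z) + b
    n = len(y)
    mean_w, mean_y = sum(w) / n, sum(y) / n
    a = (sum((wi - mean_w) * (yi - mean_y) for wi, yi in zip(w, y))
         / sum((wi - mean_w) ** 2 for wi in w))
    b = mean_y - a * mean_w
    return a, b

def gm11_predict(x0, a, b, horizon):
    """Fitted values for the observed years followed by `horizon` forecasts."""
    def x1_hat(k):
        return (x0[0] - b / a) * math.exp(-a * k) + b / a
    steps = len(x0) + horizon
    return [x0[0]] + [x1_hat(k) - x1_hat(k - 1) for k in range(1, steps)]

# Five hypothetical annual migration counts (in thousands).
x0 = [100.0, 110.0, 121.0, 133.1, 146.41]
a, b = gm11_fit(x0)
preds = gm11_predict(x0, a, b, horizon=1)
print(round(a, 4), round(b, 2), round(preds[-1], 1))
```

The appeal for HM applications is that only a handful of observations are needed, at the cost of assuming a smooth, roughly exponential trend.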

3.3. Machine-Learning Prediction Methods

Machine learning (ML) is an important branch of artificial intelligence (AI), the basic principle of which is to study how computers simulate human learning patterns, to automatically acquire knowledge and continuously upgrade their performance [101]. Compared with a traditional research model based on statistics and simulation, the machine-learning method can quickly and accurately extract effective associative information from historical data. In recent years, it has been widely used in HMP research.

According to different learning methods, machine learning can be divided into classical machine learning and deep learning. As shown in Figure 3, a range of machine-learning methods have been applied in HMP research, including illegal migration prediction, conventional migration prediction, labor migration prediction, migration flow data generation, migration trend prediction, international migration drivers, and asylum seeker prediction [14,15,16,17,18,77,78,79,80].

3.3.1. Classical Machine-Learning Prediction Method

1. Artificial Neural Network

An artificial neural network is a model that simulates the structure, function, and computation of biological neural networks; by imitating some mechanisms of the brain, it can perform functions such as image recognition and speech recognition, among others. Its main structure includes an input layer, a hidden layer, and an output layer. After years of effort and research, machine learning has shown strong advantages in the field of population migration prediction. Neural-network-based prediction of population migration takes the original data, or features extracted from the original measurement data, as the input of the network, and repeatedly adjusts the structure and parameters of the network using a training algorithm. The optimized network is then used to predict the development trend in HM.

Robinson and Dilkina were likely the first to use ML models in HMP. They used these models to address the inability of traditional linear models to capture the non-linear relationship between population migration and its characteristics, while proposing a comprehensive solution to the problems of data imbalance, hyperparameter tuning, and performance evaluation in model training, thus providing a new tool for HMP [102]. The study successfully used machine learning to predict development trends in internal and international population migration, demonstrating the advances in, and generalizability of, the prediction tool. Machine learning, therefore, offers a new and reliable tool for the assessment of future developments in population migration, and the evaluation of migration management policies.

Tarasyev et al. constructed a multi-regional migration-unemployment-wage model that uses an inductive ML approach to explore labor migration trends, based on the migrant distribution, age structure, income level, cost of migrating, labor market conditions, regional employment and unemployment information, climatic conditions, and the distance between the countries of origin and destination, among other variables [103].

Subsequently, to improve the interpretability of machine-learning models, Kiossou et al. used an interpretable machine-learning approach to study the drivers of international migration with greater accuracy than the classical gravity model [104]. This approach also provides a deeper understanding of how migration is affected by drivers, effectively revealing the non-linear relationship between covariates and outcome variables. To address the problem of predicting illegal migration, Azizi and Yektansani built a model based on eight machine-learning techniques that effectively predicts the legal status of individuals from Mexico in the United States, using data available from Princeton University’s Mexican Migration Project [16]. Based on an adaptive machine-learning algorithm, Carammia et al. developed a Dynamic Elastic Net Model that integrates government statistics and social media data, to effectively predict asylum-related migration flows [77]. Giang et al. proposed a back-propagation neural network (BPNN) model for forecasting labor production and labor migration, and the results show that this method can improve the forecasting performance, compared to the K-nearest neighbor (kNN) and random forest regression (RFR) models [17].

2. Random Forest

Random forest is an algorithm that uses multiple decision trees to train, classify, and predict samples. First proposed by Breiman in 2001, it has mainly been applied to regression and classification scenarios [105]. The predicted value of a random forest is computed from multiple decision trees (the forest), usually as the mean or mode of the outputs of all the trees. Its advantages include simple operation, fast training, and resistance to overfitting [106]. It has, therefore, become a popular tool for population migration modelling. In population migration prediction research, random forest is used to solve the regression problem in the following way. First, bootstrap sampling is used to draw k samples from the original training set, each of the same size as the original training set. Next, a decision tree model is constructed for each of the k samples, and k regression results are obtained. Finally, the k decision tree results are combined by taking their average.
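The three steps described above (bootstrap sampling, one tree per sample, averaging) can be sketched in a few lines. The version below is intentionally minimal: it grows shallow regression trees by greedy variance reduction and omits the random feature subsampling that full random forests also apply at each split, so it is closer to plain bagging. All function names and the toy data are hypothetical.

```python
import random

def sse(values):
    """Sum of squared deviations from the mean (node impurity)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def build_tree(rows, targets, depth=0, max_depth=3, min_leaf=2):
    """Grow a small regression tree; a leaf is stored as a float."""
    leaf = sum(targets) / len(targets)
    if depth >= max_depth or len(rows) < 2 * min_leaf:
        return leaf
    best = None  # (impurity, feature index, threshold)
    for feature in range(len(rows[0])):
        for threshold in sorted({r[feature] for r in rows})[1:]:
            left = [t for r, t in zip(rows, targets) if r[feature] < threshold]
            right = [t for r, t in zip(rows, targets) if r[feature] >= threshold]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            score = sse(left) + sse(right)
            if best is None or score < best[0]:
                best = (score, feature, threshold)
    if best is None:
        return leaf
    _, feature, threshold = best
    left = [(r, t) for r, t in zip(rows, targets) if r[feature] < threshold]
    right = [(r, t) for r, t in zip(rows, targets) if r[feature] >= threshold]
    return (feature, threshold,
            build_tree([r for r, _ in left], [t for _, t in left],
                       depth + 1, max_depth, min_leaf),
            build_tree([r for r, _ in right], [t for _, t in right],
                       depth + 1, max_depth, min_leaf))

def predict_tree(tree, row):
    """Follow splits until a leaf value is reached."""
    while isinstance(tree, tuple):
        feature, threshold, left, right = tree
        tree = left if row[feature] < threshold else right
    return tree

def fit_forest(rows, targets, k=25, seed=0):
    """Step 1 and 2: draw k bootstrap samples and fit one tree to each."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(k):
        idx = [rng.randrange(n) for _ in range(n)]  # same size as training set
        forest.append(build_tree([rows[i] for i in idx], [targets[i] for i in idx]))
    return forest

def predict_forest(forest, row):
    """Step 3: average the k tree predictions."""
    return sum(predict_tree(t, row) for t in forest) / len(forest)

# Toy data: migration jumps once a single driver exceeds 0.5.
rows = [(i / 40.0,) for i in range(40)]
targets = [10.0 if r[0] < 0.5 else 20.0 for r in rows]
forest = fit_forest(rows, targets)
print(predict_forest(forest, (0.1,)), predict_forest(forest, (0.9,)))
```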

To address the problem of forecasting environmental migration, Best et al. proposed a random forest model that could effectively identify significant variables from large social surveys [18]. Random forests enable the ranking of variables by importance. In regression random forest models, the importance is calculated from the node impurity, i.e., the extent to which splitting the decision trees on a particular variable reduces the variance of the outcome [15]. In this way, the model can identify the most important predictors of migration from around 2000 original factors, allowing a regression analysis with fewer variables and more degrees of freedom.

Aoga et al. proposed a tree-based machine-learning (ML) method to predict the impact of weather shocks on individual migration tendencies in six agriculture-dependent economies: Burkina Faso, Cote d’Ivoire, Mali, Mauritania, Niger, and Senegal [107]. The results show that climatic factors have positive and significant effects on the predictive performance of individual migration intentions.

3. Support Vector Machine

The support vector machine (SVM) has mainly been used to solve classification and regression problems in machine learning, and is suitable for analyzing small samples and multidimensional data [108]. Unlike traditional neural network learning methods, SVM is based on Vapnik–Chervonenkis dimension theory and the structural risk minimization principle. It also minimizes empirical risk, generalizes well to future samples, and has many advantages, such as a simple structure, good adaptability, a globally optimal solution, fast training speed, and broad applicability. The SVM-based HM forecasting method uses actual HM data to train the support vector machine model and determine the model parameters (e.g., the insensitivity coefficient, penalty factor, and kernel function parameters); the trained model then forecasts the future state, and the predicted value of HM is compared with a pre-set threshold.

Zhang et al. employed the SVM algorithm to create a classification model for the migration of Beijing’s unregistered resident population, and conducted an empirical analysis of migration data from various surveys in Beijing [109]. The results show that SVM is more accurate and generalizable than the basic BP neural network and logistic regression for these specific classification tasks, and can forecast migration trends with greater accuracy.

3.3.2. Deep-Learning Prediction Method

Traditional shallow machine-learning algorithms rely heavily on expert prior knowledge and signal processing technology, and struggle to process and explore massive monitoring data automatically. As a new technology developed from neural networks, deep learning, with its powerful feature-extraction capability, provides a solution for training on massive data. However, in the field of migration research, due to differences in the relevant concepts, and insufficient human, material, and financial resources, the data collected on population migration are sparse or entirely lacking, leading to an inaccurate and out-of-date understanding of population migration. Social media data offer a new way to obtain timely and more complete population migration information. With the acceleration of globalization, social media and traditional data complement each other, further meeting the needs of the efficient management of population migration.

1. Recurrent Neural Networks

A recurrent neural network (RNN) is a neural network with short-term memory capability. In an RNN, neurons can accept information both from other neurons and from themselves, forming a looped network structure. Compared with a feedforward neural network, an RNN is more consistent with the structure of biological neural networks. Because of the feedforward and feedback connections between neurons at each layer, RNNs are suitable for processing sequence data with forward and backward dependencies, in tasks such as language modelling and natural language generation. The parameters of an RNN can be learned using back-propagation through time, which passes error information backwards step by step, in reverse chronological order. A relatively long input sequence causes exploding and vanishing gradient problems. To solve these problems, some improved structures have been proposed. Among these, the gated recurrent unit (GRU) and long short-term memory (LSTM) are typical representatives that mitigate the vanishing and exploding gradients of traditional RNNs, enabling the network to extract the deep characteristics of the time series while also capturing its long-term dependence. This allows the model to obtain better prediction results.
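The gating mechanism that lets an LSTM retain information over long sequences can be seen in a single cell update. The sketch below uses the standard LSTM equations with a scalar input and a scalar hidden state to keep it readable; the weights are hypothetical and untrained, so the output has no predictive meaning, and the function and gate names are chosen for illustration only.

```python
import math

def sigmoid(z):
    """Logistic function used by the three gates."""
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, params):
    """One step of a scalar LSTM cell; params[gate] = (w_x, w_h, bias)."""
    def pre_activation(gate):
        w_x, w_h, bias = params[gate]
        return w_x * x + w_h * h_prev + bias
    forget = sigmoid(pre_activation("forget"))      # how much old memory to keep
    write = sigmoid(pre_activation("input"))        # how much new content to store
    expose = sigmoid(pre_activation("output"))      # how much memory to reveal
    candidate = math.tanh(pre_activation("candidate"))
    c = forget * c_prev + write * candidate         # additive memory update
    h = expose * math.tanh(c)
    return h, c

def run_lstm(sequence, params):
    """Feed a sequence through the cell, starting from zero state."""
    h, c = 0.0, 0.0
    for x in sequence:
        h, c = lstm_step(x, h, c, params)
    return h, c

# Hypothetical weights and a short, normalised migration series.
params = {
    "forget": (0.5, 0.1, 1.0),
    "input": (0.8, -0.2, 0.0),
    "output": (0.3, 0.4, 0.0),
    "candidate": (1.2, 0.5, 0.0),
}
h_final, c_final = run_lstm([0.2, 0.4, 0.1, 0.6], params)
print(round(h_final, 4), round(c_final, 4))
```

The cell state is updated additively (`forget * c_prev + write * candidate`) rather than by repeated matrix multiplication, which is the usual explanation for why gradients survive longer than in a plain RNN.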

Golenvaux et al. used LSTM to predict international migration based on Google Trends data, using one-hot encoded input vectors to incorporate more complex time-invariant factors (such as the distance between two countries, and common language) [110]. The results show that the LSTM method is significantly superior to the standard artificial neural network and the traditional gravity model; in addition, the authors adjusted the LSTM structure and added crisis indicators, to improve the accuracy of the model when a particular year is affected by abnormal shocks.
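One-hot encoding, as mentioned above, turns a categorical label into a binary vector with a single 1. The snippet below is a generic illustration rather than the preprocessing used by Golenvaux et al.; the function name and the country codes are hypothetical.

```python
def one_hot(category, vocabulary):
    """Binary vector with a 1 at the position of `category` in `vocabulary`."""
    if category not in vocabulary:
        raise ValueError(f"unknown category: {category!r}")
    return [1 if category == item else 0 for item in vocabulary]

countries = ["BE", "FR", "DE", "ES"]          # assumed list of origin countries
origin_vector = one_hot("DE", countries)
print(origin_vector)
```

Such a vector can be concatenated with the time-varying inputs at every step, which lets the network learn country-specific offsets without imposing an ordering on the categories.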

2. Graph Neural Networks

Graph neural networks (GNNs), as a generalization of recurrent neural networks, are widely used due to their powerful ability to process complex graph data. By applying certain strategies to the nodes and edges of the graph, a GNN converts graph-structured data into a standardized representation, which can be fed into a variety of neural networks for training, achieving good results in node classification, link prediction, graph clustering, and other tasks. A real-world network can be mapped to relationships between nodes and edges, and a GNN can be used to generate a graph from unstructured data. The output does not change with the input order of the nodes. The edges represent the dependency between two nodes, and the state of each node can be updated from the states of its neighbours.
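The neighbour-based state update and the order-independence property can be illustrated with one round of mean-aggregation message passing. The sketch below is a simplification: it has no learned weight matrices or nonlinearity, the mixing weight `self_weight` is an assumed constant, and the region names and feature values are hypothetical.

```python
def message_passing_step(features, edges, self_weight=0.5):
    """Update each node by mixing its state with the mean of its neighbours."""
    neighbours = {node: [] for node in features}
    for u, v in edges:                       # undirected edges
        neighbours[u].append(v)
        neighbours[v].append(u)
    updated = {}
    for node, own in features.items():
        linked = neighbours[node]
        if linked:
            mean = [sum(features[n][d] for n in linked) / len(linked)
                    for d in range(len(own))]
        else:
            mean = own                       # isolated node keeps its state
        updated[node] = [self_weight * o + (1.0 - self_weight) * m
                         for o, m in zip(own, mean)]
    return updated

# Three regions on a chain A - B - C with one feature each.
features = {"A": [1.0], "B": [3.0], "C": [5.0]}
edges = [("A", "B"), ("B", "C")]
state = message_passing_step(features, edges)
print(state)

# Presenting nodes and edges in a different order gives the same result.
shuffled = message_passing_step({"C": [5.0], "A": [1.0], "B": [3.0]},
                                [("C", "B"), ("B", "A")])
print(shuffled == state)
```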

Terroso-Sáenz and Muñoz proposed a method for predicting population movement at the national level [78]. This method uses GNNs to consider the potential relationship between large geographical regions, and realizes the prediction of population movement between cities on national spatial granularity. In addition, they also introduced the impact of climatic factors on population flows. The results show that the effect of weather factors is not obvious, due to the mismatch between climate data and model data.

Additionally, the feasibility of the use of Twitter data to predict internal HM has been explored, to address the problem of an insufficient data volume for deep-learning prediction models [76,78]. The results showed that Twitter data have considerable value in HMP. Future studies could focus on the selection of pertinent data, and the design of efficient feature models to further the research on deep-learning-based HMP.

Although ML methods have seen some achievements in HM research in recent years, they are still at a preliminary stage overall, and have had a limited impact on HMP research. Massive amounts of data are required to fully realize the predictive power of ML, and the current sample sizes in HM research are far below the minimum needed for accurate predictions. In addition, given the complexity of cross-border HM policy changes, the related models lack robustness, and ML models generalize poorly. Furthermore, neural networks are essentially black boxes, and their reliability is difficult to guarantee. To promote the application of neural network methods in population migration prediction, a first step is to use new media data to replace or augment traditional data, and to use transfer learning to reduce the amount of data required for model training. The next step is to take advantage of machine learning’s ability to parse large amounts of text and conduct sentiment analysis, to reduce the impact of macroeconomic policies and improve the robustness and generalizability of the models. Moreover, exploiting model interpretability further improves predictive reliability; for example, Simini et al. used SHapley Additive exPlanations (SHAP) values to understand how different geographic features play a role in the model [111].
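The idea behind SHAP values can be shown with an exact Shapley computation on a tiny model. The sketch below does not use the SHAP library and is not the setup of Simini et al.: it enumerates every feature ordering, which is only feasible for a handful of features, and it replaces "missing" features with a fixed baseline. The toy model, its feature names, and all numbers are hypothetical.

```python
from itertools import permutations

def shapley_values(model, instance, baseline):
    """Exact Shapley attributions by averaging over all feature orderings."""
    n = len(instance)
    totals = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)
        previous = model(current)
        for idx in order:
            current[idx] = instance[idx]     # switch feature idx on
            value = model(current)
            totals[idx] += value - previous  # marginal contribution
            previous = value
    return [t / len(orders) for t in totals]

def toy_flow_model(x):
    """Hypothetical flow score from population, distance, and an interaction."""
    population, distance, amenities = x
    return 2.0 * population - 1.0 * distance + 0.5 * population * amenities

instance = [3.0, 4.0, 2.0]
baseline = [0.0, 0.0, 0.0]
attributions = shapley_values(toy_flow_model, instance, baseline)
print([round(a, 3) for a in attributions])
```

The attributions always sum to the difference between the prediction for the instance and the prediction for the baseline, which is what makes them convenient for explaining individual flow predictions.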

4. Uncertainty in Population Migration Projections

As the key to the success of HMP, datasets are traditionally collected using official statistics or survey data, compiled and published by relevant organizations. However, definitions of HM statistics differ from country to country. During the collection process, governments often take a convenience approach, using tools developed for other purposes rather than ones specifically designed to measure HM and its outcomes [112]. Even in developed countries such as EU nations, information on migrating populations remains rudimentary. For instance, Cyprus, Ireland, France, Portugal, and the United Kingdom use surveys, whereas Romania and Greece use other data sources, including mirror statistics from other countries. The remaining countries (e.g., Germany) use population registers, partly including registers for foreign citizens. As a result, the data do not capture migration processes well, are often not comparable across migration studies, and can be of very problematic quality.

As a result, HM definitions and data quality issues often cause forecasting methods to be incomparable, with inaccurate forecasts, forecasting models that lack robustness, and significant uncertainty [67,112]. Moreover, the numerous social, political, demographic, economic, environmental, and technological drivers in HM forecasting are highly uncertain, and difficult to quantify. Their interactions lead to different migration outcomes, and bring significant uncertainty to HMP.

Although there has been an accumulation of theories and empirical studies related to HM, no single theory has proven comprehensive enough to cover the multiple forms of migration. The push and pull factors (determinants), or the drivers of migration and non-migration, interact with each other, thus rendering a comprehensive explanation of the migration process impossible, even if they cover most cases. Therefore, HM theory has a limited role in the interpretation of the results. Moreover, HM forecasting involves different disciplines, and different experts have entirely different expectations about HM changes. The cumulative effect of all of these uncertainties hampers the development of HMP. Therefore, attention should be given to uncertainty in all HM projections; otherwise, such uncertainties will propagate.

5. Conclusions

This paper examined the work on HM forecasting up to 2022. The results show that HMP is developing rapidly. The input data have changed from simple stock information to multi-source spatiotemporal data, including spatiotemporal scene characteristics and individual behavioral characteristics. Prediction models and algorithms are evolving from simple linear models to nonlinear models. Most importantly, predictions are becoming more refined and precise. Currently, HMP can be broadly classified into three types: deterministic, stochastic, and machine-learning HMP. Deterministic HMP estimates future population migration by setting model parameters in one or more scenarios; it depends little on data and can make full use of experts' subjective predictions. The method is simple to operate, suitable for medium- and long-range forecasting, and widely used by many countries and international institutions (e.g., the United Nations, the European Union) [96]. However, because its uncertainty is not evaluated in a consistent and explicit quantitative sense, its results are open to ambiguous interpretation; the medium scenario is often read as the most likely forecast, which has also led to errors. Unlike those of the deterministic model, the stochastic model's parameters are not fixed values but random variables. The stochastic prediction model describes the influence of driving factors on population migration through statistical analysis or an assumed probability distribution based on historical data, so the uncertainty in population movements is explicitly taken into account. As a result, both the most probable outcome and a prediction range can be forecast. In addition, the model's predictive ability can be improved by introducing expert knowledge. However, stochastic models are highly dependent on the chosen variables and the available data. They involve a high degree of subjectivity, and their operational difficulty makes them unsuitable for beginners. With the development of big data and artificial intelligence technology, machine learning has enabled great progress in HMP, but its further application still has a long way to go. An important reason is that most machine-learning methods use social media data, which cannot effectively explain the causality of population migration, even though the methods' flexible choice of functional forms can effectively improve predictive ability. In addition, data privacy and data protection problems have restricted the use of machine-learning methods in predicting population migration. In the future, more data on population migration are expected to become available, so the models will become more sophisticated. Regardless of how population migration prediction evolves, the data and model algorithms must be selected reasonably, based on clear application scenarios and an explicit treatment of uncertainty, to achieve better results. The analysis and summary of population migration projections presented in this paper can help develop and improve the field to some extent.
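
The contrast between the deterministic and stochastic approaches summarised above can be illustrated with a minimal Python sketch. It is not a method taken from any of the reviewed studies: the function names, the constant-growth scenarios, and the normally distributed annual growth rate are all assumptions chosen for illustration. The deterministic function returns one trajectory per scenario with no statement of likelihood, whereas the stochastic function treats the growth rate as a random variable and reports a median together with a prediction range.

```python
import random
import statistics


def deterministic_scenarios(base_flow, growth_by_scenario, years):
    """Project one trajectory per scenario using a fixed annual growth rate.

    The scenario names and rates are hypothetical inputs set by the analyst.
    """
    return {
        name: [base_flow * (1.0 + rate) ** t for t in range(1, years + 1)]
        for name, rate in growth_by_scenario.items()
    }


def stochastic_projection(base_flow, mean_rate, sd_rate, years,
                          n_paths=2000, seed=0):
    """Simulate many paths whose annual growth rate is a random variable.

    Returns, for each year, a tuple (10th percentile, median, 90th percentile).
    The Gaussian growth rate is an illustrative assumption, not an estimate.
    """
    rng = random.Random(seed)
    paths = []
    for _ in range(n_paths):
        flow, path = base_flow, []
        for _ in range(years):
            flow *= 1.0 + rng.gauss(mean_rate, sd_rate)
            path.append(flow)
        paths.append(path)

    summary = []
    for t in range(years):
        values = sorted(path[t] for path in paths)
        low = values[int(0.10 * (n_paths - 1))]
        high = values[int(0.90 * (n_paths - 1))]
        summary.append((low, statistics.median(values), high))
    return summary
```

Calling `deterministic_scenarios(1000.0, {"low": 0.0, "medium": 0.02, "high": 0.04}, 5)` yields three fixed trajectories, while `stochastic_projection(1000.0, 0.02, 0.05, 5)` yields an interval that widens with the forecast horizon, which is the quantitative statement of uncertainty that scenario-based projections lack.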

6. Outlook

Overall, HMP is a multidisciplinary research and application area. However, it has limitations, including low data quantity and quality, insufficient support from systematic theory, a lack of innovative methods, and an imbalance across research areas. In light of recent trends in big data and artificial intelligence, this paper identifies several steps that need to be taken to ensure the future development of HMP.

Firstly, global population growth is gradually slowing. Population growth in many countries now relies primarily on HM, which has consequently become an important issue for the population and political security of some countries. Only with a firm grasp of actual HM data can scientific and practical measures be taken to ensure the stable development of economies and societies. To ensure the comparability and uniformity of data, the statistical institutions of all countries should strengthen cooperation and follow uniform statistical standards. In addition, to ensure the comprehensiveness and adequacy of data, data from transit countries or regions should be collected alongside data from origin and destination countries or regions. Data should be freely and publicly available, and government agencies, industry, and academia should collaborate to facilitate rapid research on migration prediction.

Secondly, with the development of big data and AI technology, the volume, variety, and granularity of data will continue to increase, challenging the capabilities of prediction models. The dynamic change in HM depends not only on temporal differences, but also on the transformation of spatial characteristics [113]. Traditional model-driven methods are unsuitable for processing spatiotemporal series, because they cannot capture their hidden nonlinear features. The graph neural network (GNN), an efficient deep-learning framework based on the graph data structure, is widely used in various fields and has achieved remarkable results. HM flow data naturally form a graph structure, which makes GNNs a natural fit for them. HMP models based on GNNs will therefore be an essential future development direction.
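
The claim that HM flow data have a natural graph structure can be made concrete with a small sketch. Regions are nodes, an origin-destination flow matrix supplies weighted edges, and one message-passing step mixes each region's features with a flow-weighted average of its destinations' features. This is an illustrative layer in plain Python, not the architecture of any model cited in this review; the function names, the row normalisation, and the untrained weight matrix are assumptions.

```python
def normalize_rows(matrix):
    """Scale each row to sum to 1 so aggregation becomes a weighted mean."""
    result = []
    for row in matrix:
        total = sum(row)
        result.append([v / total if total else 0.0 for v in row])
    return result


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def message_passing_layer(flows, features, weight):
    """One illustrative GNN step: H' = ReLU((H + P H) W).

    flows[i][j] is the migration flow from region i to region j,
    P is the row-normalised flow matrix, H holds one feature row per
    region, and W is a (here untrained, hypothetical) weight matrix.
    """
    neighbour_mean = matmul(normalize_rows(flows), features)
    combined = [[h + m for h, m in zip(h_row, m_row)]
                for h_row, m_row in zip(features, neighbour_mean)]
    hidden = matmul(combined, weight)
    return [[max(0.0, v) for v in row] for row in hidden]


# Hypothetical example: three regions with one feature each.
flows = [[0.0, 30.0, 10.0],
         [20.0, 0.0, 20.0],
         [5.0, 15.0, 0.0]]
features = [[1.0], [2.0], [3.0]]
weight = [[0.5, -1.0]]
embeddings = message_passing_layer(flows, features, weight)
```

In a real model the weight matrix would be learned from observed flows and several such layers would be stacked, typically alongside a temporal component; the sketch only shows how the flow matrix itself acts as the graph.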

Thirdly, the development of HMP research will experience peaks and troughs as big data and AI technology are applied. Countries should use this period of growth to solve the longstanding talent shortage in HM research. Governments should plan to facilitate the migration of skilled people, rely on the universities and institutions that carry out population research, and train a reserve of talent with a foundation in AI technology for the research and management of HM. In addition, different on-the-job training models can be used to improve the skills and abilities of government management teams, thereby creating a talent pool that can manage HM and carry out research in the new era.

Finally, while relevant government agencies have obtained a large amount of stock data in the traditional process of HM management, they have been unable to make full use of these data because of technical limitations. Without technical support, management methods are often based on the personal intentions and subjective opinions of managers. With the development of big data and AI technology, the acquisition, preservation, and processing of HM big data have become possible. Important directions for future development will be to use big data to improve the management of HM, to establish and improve mechanisms for scientific decision-making and social management of HM, to promote innovation in government management and social governance, and to strengthen research into, and development of, HMP and intelligent decision-support systems.

Author Contributions

T.P., C.H., J.Y. and M.H. contributed equally to the authorship of the manuscript, including the research design, conducting the research, performing the analysis, and writing the manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (NSFC), grant numbers 61963037, 61863035, 62261059.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. China’s State Council. China Population Census Yearbook 2020; China Statistics Press: Beijing, China, 2022; pp. 10–23.
2. United Nations. International Migrant Stock 2019; (United Nations Database, POP/DB/MIG/Stock/Rev.2019); Department of Economic and Social Affairs, United Nations: Geneva, Switzerland, 2019.
3. Sharma, D.R.; Kandpal, D.V. COVID 19 pandemic and international migration: An initial view. Sustain. Oper. Comput. 2021, 2, 122–126.
4. Willekens, F.; Massey, D.; Raymer, J.; Beauchemin, C. International migration under the microscope. Science 2016, 352, 897–899.
5. King, R.; Skeldon, R. “Mind the gap!” integrating approaches to internal and international migration. J. Ethn. Migr. Stud. 2010, 36, 1619–1646.
6. Zipf, G.K. The P1P2/D hypothesis: On the intercity movement of persons. Am. Sociol. Rev. 1946, 11, 677–686.
7. Bogue, D.J. Internal migration. In The Study of Population; Hauser, P.M., Duncan, O.D., Eds.; University of Chicago Press: Chicago, IL, USA, 1959.
8. Lee, E.S. A theory of migration. Demography 1966, 3, 47–57.
9. Simini, F.; González, M.C.; Maritan, A.; Barabási, A.L. A universal model for mobility and migration patterns. Nature 2012, 484, 96–100.
10. Brockmann, D.; Hufnagel, L.; Geisel, T. The scaling laws of human travel. Nature 2006, 439, 462–465.
11. Sun, Y.; Pan, K. Prediction of the intercity migration of Chinese graduates. J. Stat. Mech. Theory Exp. 2014, 12, P12022.
12. Geng, Y.; Wang, R.; Wei, Z.; Zhai, Q. Temporal-spatial measurement and prediction between air environment and inbound tourism: Case of China. J. Clean. Prod. 2021, 287, 125486.
13. Pu, T.; Huang, M.; Yang, J. Forecasting international migrants using grey model with heat label. In Proceedings of the 5th International Conference on Computer Science and Software Engineering (CSSE), Guilin, China, 21–23 October 2022; pp. 652–656.
14. Weber, H. How well can the migration component of regional population change be predicted? A machine learning approach applied to German municipalities. Comp. Popul. Stud. 2020, 45, 143–178.
15. Hastie, T.; Tibshirani, R.; Friedman, J.H. The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd ed.; Springer: New York, NY, USA, 2009.
16. Azizi, S.; Yektansani, K. Artificial intelligence predicting illegal immigration to the USA. Int. Migr. 2020, 58, 183–193.
17. Giang, N.H.; Nguyen, T.-T.; Tay, C.C.; Phuong, L.A.; Dang, T.-T. Towards Predictive Vietnamese Human Resource Migration by Machine Learning: A Case Study in Northeast Asian Countries. Axioms 2022, 11, 151.
18. Best, K.; Gilligan, J.; Baroud, H.; Carrico, A.; Donato, K.; Mallick, B. Applying machine learning to social datasets: A study of migration in southwestern Bangladesh using random forests. Reg. Environ. Chang. 2022, 22, 52.
19. Disney, G.; Wiśniowski, A.; Forster, J.J.; Smith, P.W.F.; Bijak, J. Evaluation of existing migration forecasting methods and models. In Report for the Migration Advisory Committee; University of Southampton: Southampton, UK, 2015. Available online: https://www.gov.uk/government/publications/evaluationof-existing-migration-forecasting-methods-and-models (accessed on 18 January 2022).
20. Sardoschau, S. The Future of Migration to Germany. Assessing Methods in Migration Forecasting. DeZIM Briefing Notes 4, Berlin: Deutsches Zentrum für Integrations- und Migrationsforschung (DeZIM). 2020. Available online: https://policycommons.net/artifacts/1930344/the-future-of-migration-to-germany/2682114/ (accessed on 18 January 2022).
21. Vanella, P.; Deschermeier, P.; Wilke, C.B. An Overview of Population Projections—Methodological Concepts, International Data Availability, and Use Cases. Forecasting 2020, 2, 19.
22. Celi, G.; Sica, E. Globalization and internal migration: Evidence from inter-provincial mobility in Vietnam. Reg. Stud. Reg. Sci. 2023, 10, 1–19.
23. Van de Walle, E.; Henry, L. Multilingual Demographic Dictionary: English Section; Ordina Editions: Liege, Belgium, 1982.
24. Möhring, M. Tourism and Migration: Interrelated Forms of Mobility. Comparativ 2014, 24, 116–123.
25. Skeldon, R. International Migration, Internal Migration, Mobility and Urbanization: Towards more Integrated Approaches; United Nations: New York, NY, USA, 2018.
26. Otoiu, A.; Titan, E.; Dumitrescu, R. Internal and international migration: Is a dichotomous approach justified? Procedia-Soc. Behav. Sci. 2014, 109, 1011–1015.
27. Cirillo, M.; Cattaneo, A.; Miller, M.; Sadiddin, A. Establishing the link between internal and international migration: Evidence from Sub-Saharan Africa. World Dev. 2022, 157, 105943.
28. Bernard, A.; Perales, F. Linking internal and international migration in 13 European countries: Complementarity or substitution? J. Ethn. Migr. Stud. 2022, 48, 655–675.
29. Provenzano, D.; Baggio, R. The contribution of human migration to tourism: The VFR travel between the EU 28 member states. Int. J. Tour. Res. 2017, 19, 412–420.
30. Marschall, S. Memory, Migration and Travel; Marschall, S., Ed.; Routledge: London, UK, 2018; pp. 1–23.
31. Ravenstein, E.G. The laws of migration. J. Stat. Soc. Lond. 1885, 48, 167–235.
32. O’Reilly, K. Migration theories: A critical overview. In Routledge Handbook of Immigration and Refugee Studies; Routledge: Abingdon, UK, 2015; pp. 25–33.
33. Arango, J. Theories of international migration. In International Migration in the New Millennium; Ashgate: Aldershot, UK, 2017; pp. 25–45.
34. Lewis, G.J. Human Migration: A Geographical Perspective; Croom Helm: London, UK, 1982.
35. Sohst, R.; Tjaden, J.; de Valk, H.; Melde, S. The Future of Migration to Europe: A Systematic Review of the Literature on Migration Scenarios and Forecasts; International Organization for Migration: Geneva, Switzerland, 2020.
36. Kupiszewski, M.; Bijak, J.; Kicinger, A. The Use of International Migration Theories in Migration Forecasting—A Practical Approach. In International Migration and the Future of Populations and Labour in Europe; Kupiszewski, M., Ed.; The Springer Series on Demographic Methods and Population Analysis; Springer: Dordrecht, The Netherlands, 2013; Volume 32.
37. Salamońska, J. Quantitative methods in migration research. In Introduction to Migration Studies: An Interactive Guide to the Literatures on Migration and Diversity; IMISCOE Research Series; Scholten, P., Ed.; Springer International Publishing: Cham, Switzerland, 2022; pp. 425–438.
38. Chen, Y.; Rosenthal, S.S. Local amenities and life-cycle migration: Do people move for jobs or fun? J. Urban Econ. 2008, 64, 519–537.
39. Demirel, D.F.; Basak, M. A fuzzy bi-level method for modeling age-specific migration. Socio-Econ. Plan. Sci. 2018, 68, 100664.
40. Spyratos, S.; Vespe, M.; Natale, F.; Ingmar, W.; Zagheni, E.; Rango, M. Migration Data Using Social Media: A European Perspective; EUR 29273 EN; Publications Office of the European Union: Luxembourg, 2018; ISBN 978-92-79-87989-0.
41. Beduschi, A. International migration management in the age of artificial intelligence. Migr. Stud. 2021, 9, 576–596.
42. Smith, S.K. Accounting for migration in cohort-component projections of state and local populations. Demography 1986, 23, 127–135.
43. Hyndman, R.J.; Booth, H. Stochastic population forecasts using functional data models for mortality, fertility and migration. Int. J. Forecast. 2008, 24, 323–342.
44. Fuchs, J.; Söhnlein, D.; Vanella, P. Migration Forecasting—Significance and Approaches. Encyclopedia 2021, 1, 54.
45. Gorbey, S.; James, D.; Poot, J. Population forecasting with endogenous migration: An application to trans-Tasman migration. Int. Reg. Sci. Rev. 1999, 22, 69–101.
46. Bijak, J.; Wiśniowski, A. Bayesian forecasting of immigration to selected European countries by using expert knowledge. J. R. Stat. Soc. A 2010, 173, 775–796.
47. Abel, G.; Bijak, J.; Findlay, A.; McCollum, D.; Wiśniowski, A. Forecasting environmental migration to the United Kingdom: An exploration using Bayesian models. Popul. Environ. 2013, 35, 183–203.
48. Wiśniowski, A.; Bijak, J.; Shang, H.L. Forecasting Scottish migration in the context of the 2014 constitutional change debate. Popul. Space Place 2014, 20, 455–464.
49. Azose, J.J.; Raftery, A.E. Bayesian probabilistic projection of international migration. Demography 2015, 52, 1627–1650.
50. Raymer, J.; Wiśniowski, A. Applying and testing a forecasting model for age and sex patterns of immigration and emigration. Popul. Stud. 2018, 72, 339–355.
51. Frees, E.W. Short-Term Forecasting of Internal Migration. Environ. Plan. A Econ. Space 1993, 25, 1593–1606.
52. Ramos, R.; Surinach, J. A Gravity Model of Migration between ENC and EU. In IZA Discussion Papers, No. 7700; Institute for the Study of Labor (IZA): Bonn, Germany, 2013.
53. Campos, R.G. Migratory pressures in the long run: International migration projections to 2050. Banco De Esp. Artic. 2017, 38, 17.
54. Iancu, N.; Badulescu, A.; Urziceanu, R.M.; Iancu, E.A.; Simut, R. The use of the gravity model in forecasting the flows of emigrants in EU countries. Technol. Econ. Dev. Econ. 2017, 23, 392–409.
55. Böhme, M.H.; Gröger, A.; Stöhr, T. Searching for a better life: Predicting international migration with online search keywords. J. Dev. Econ. 2020, 142, 102347.
56. Frees, E.W. Forecasting state-to-state migration rates. J. Bus. Econ. Stat. 1992, 10, 153–167.
57. Beer, J.D. Forecast intervals of net migration: The case of the Netherlands. J. Forecast. 1993, 12, 585–599.
58. García-Guerrero, V.M. A probabilistic method to forecast the international migration of Mexico by age and sex. Pap. De Población 2016, 22, 113–140.
59. Schoumaker, B.; Beauchemin, C. Reconstructing trends in international migration with three questions in household surveys: Lessons from the MAFE project. Demogr. Res. 2015, 32, 983–1030.
60. Vanella, P.; Deschermeier, P. A stochastic Forecasting Model of international Migration in Germany. In Familie—Bildung—Migration. Familienforschung Im Spannungsfeld Zwischen Wissenschaft, Politik Und Praxis. Tagungsband Zum 5. Europäischen Fachkongress Familienforschung; Kapella, O., Schneider, N.F., Rost, H., Eds.; Verlag Barbara Budrich: Opladen/Berlin, Germany; Toronto, ON, Canada, 2018; pp. 261–280.
61. Bijak, J.; Disney, G.; Findlay, A.M.; Forster, J.J.; Smith, P.W.; Wiśniowski, A. Assessing time series models for forecasting international migration: Lessons from the United Kingdom. J. Forecast. 2019, 38, 470–487.
62. Vollset, S.E.; Goren, E.; Yuan, C.W.; Cao, J.; Smith, A.E.; Hsiao, T.; Bisignano, C.; Azhar, G.S.; Castro, E.; Chalek, J.; et al. Fertility, mortality, migration, and population scenarios for 195 countries and territories from 2017 to 2100: A forecasting analysis for the Global Burden of Disease Study. Lancet 2020, 396, 1285–1306.
63. Shimizu, S.; Shin, S. Applicability of SARIMA Model in Tokyo Population Migration Forecast. In Proceedings of the 2021 14th International Conference on Human System Interaction (HSI), Gdańsk-Wrzeszcz, Poland, 8–10 July 2021; pp. 1–4.
64. Fantazzini, D.; Pushchelenko, J.; Mironenkov, A.; Kurbatskii, A. Forecasting Internal Migration in Russia Using Google Trends: Evidence from Moscow and Saint Petersburg. Forecasting 2021, 3, 48.
65. Kupiszewski, M. How trustworthy are forecasts of international migration between Poland and the European Union? J. Ethn. Migr. Stud. 2002, 28, 627–645.
66. Cappelen, Å.; Skjerpen, T.; Tønnessen, M. Forecasting Immigration in Official Population Projections Using an Econometric Model. Int. Migr. Rev. 2015, 49, 945–980.
67. Azose, J.J.; Sevcikova, H.; Raftery, A.E. Probabilistic population projections with migration uncertainty. Proc. Natl. Acad. Sci. USA 2016, 113, 6460–6465.
68. Vasilyeva, A.V. The Forecast of Labour Migration, Reproduction of the Population and Economic Development of Russia. Econ. Reg. 2017, 13, 812–826.
69. Shayegh, S.; Emmerling, J.; Tavoni, M. International Migration Projections across Skill Levels in the Shared Socioeconomic Pathways. Sustainability 2022, 14, 4757.
70. Brücker, H.; Siliverstovs, B. On the estimation and forecasting of international migration: How relevant is heterogeneity across countries? Empir. Econ. 2006, 31, 735–754.
71. Bahna, M. Predictions of Migration from the New Member States after Their Accession into the European Union: Successes and Failures. Int. Migr. Rev. 2008, 42, 844–860.
72. Rogers, T.W. Migration Prediction On The Basis Of Prior Migratory Behavior: A Methodological Note. Int. Migr. 1969, 7, 13–19.
73. Zagheni, E.; Garimella, V.R.K.; Weber, I.; State, B. Inferring international and internal migration patterns from Twitter data. In Proceedings of the 23rd International Conference on World Wide Web, Seoul, Korea, 7–11 April 2014; pp. 439–444.
74. Plaut, T.R. An econometric model for forecasting regional population growth. Int. Reg. Sci. Rev. 1981, 6, 53–70.
75. Ovchynnikova, O.; Nahornova, O.; Mylko, I.; Begun, S.; Buniak, N.; Kolenda, N. Forecasting Regional Migration Flows. In Proceedings of the 10th International Conference on Advanced Computer Information Technologies (ACIT), Deggendorf, Germany, 16–18 September 2020; pp. 165–169.
76. Jurdak, R.; Zhao, K.; Liu, J.; AbouJaoude, M.; Cameron, M.; Newth, D. Understanding Human Mobility from Twitter. PLoS ONE 2015, 10, e0131469.
77. Carammia, M.; Iacus, S.M.; Wilkin, T. Forecasting asylum-related migration flows with machine learning and data at scale. Sci. Rep. 2022, 12, 1457.
78. Terroso-Sáenz, F.; Muñoz, A. Nation-wide human mobility prediction based on graph neural networks. Appl. Intell. 2022, 52, 4144–4160.
79. Terroso-Saenz, F.; Flores, R.; Muñoz, A. Human mobility forecasting with region-based flows and geotagged Twitter data. Expert Syst. Appl. 2022, 203, 117477.
80. Terroso-Sáenz, F.; Muñoz, A.; Fernández-Pedauye, J.; Cecilia, J.M. Human Mobility Prediction With Region-Based Flows and Water Consumption. IEEE Access 2021, 9, 88651–88663.
81. Gaigbe-Togbe, V.; Bassarsky, L.; Gu, D.; Spoorenberg, T.; Zeifman, L. World Population Prospects 2022; United Nations: New York, NY, USA, 2022; ISBN 978-92-1-148373-4.
82. Fertig, M.; Schmidt, C.M. Aggregate-level migration studies as a tool for forecasting future migration streams. In International Migration: Trends, Policy and Economic Impact; Institute for the Study of Labor: London, UK; New York, NY, USA, 2005; pp. 110–136.
83. Dustmann, C.; Casanova, M.; Fertig, M.; Preston, I.; Schmidt, C.M. The Impact of EU Enlargement on Migration Flows; Home Office Online Report 25/03; Research Development and Statistics Directorate, Home Office: London, UK, 2003; pp. 1–76.
84. Alvarez-Plata, P.; Brücker, H.; Siliverstovs, B. Potential Migration from Central and Eastern Europe into the EU-15: An Update; European Commission, Directorate-General for Employment and Social Affairs: Brussels, Belgium, 2003.
85. Dao, T.H.; Docquier, F.; Maurel, M.; Schaus, P. Global migration in the twentieth and twenty-first centuries: The unstoppable force of demography. Rev. World Econ. 2021, 157, 417–449.
86. Burzynski, M.; Deuster, C.; Docquier, F. Geography of skills and global inequality. J. Dev. Econ. 2020, 142, 102333.
87. Anderson, J.E. The gravity model. Annu. Rev. Econ. 2011, 3, 133–160.
88. Beine, M.; Bertoli, S.; Fernández-Huertas Moraga, J. A practitioners’ guide to gravity models of international migration. World Econ. 2016, 39, 496–512.
89. Hanson, G.; McIntosh, C. Is the Mediterranean the new Rio Grande? US and EU immigration pressures in the long run. J. Econ. Perspect. 2016, 30, 57–82.
90. Bertoli, S.; Brücker, H.; Moraga, J.F.H. The European crisis and migration to Germany. Reg. Sci. Urban Econ. 2016, 60, 61–72.
91. Sjaastad, L.A. The costs and returns of human migration. J. Political Econ. 1962, 70 Pt 2, 80–93.
92. Backhaus, A.; Martinez-Zarzoso, I.; Muris, C. Do climate variations explain bilateral migration? A gravity model analysis. IZA J. Migr. 2015, 4, 3.
93. Friebel, G.; Manchin, M.; Mendola, M.; Prarolo, G. International Migration Intentions and Illegal Costs: Evidence Using Africa-to-Europe Smuggling Routes. CEPR Discussion Paper No. DP13326. 2018. Available online: https://ssrn.com/abstract=3290517 (accessed on 20 March 2022).
94. Rikani, A.; Schewe, J. Global bilateral migration projections accounting for diasporas, transit and return flows, and poverty constraints. Demogr. Res. 2021, 45, 87–140.
95. Beyer, R.M.; Schewe, J.; Lotze-Campen, H. Gravity models do not explain, and cannot predict, international migration dynamics. Humanit. Soc. Sci. Commun. 2022, 9, 56.
96. Bijak, J. Forecasting International Migration in Europe: A Bayesian View; Springer Science+Business Media: Dordrecht, The Netherlands; Heidelberg, Germany; London, UK; New York, NY, USA, 2011.
97. Wiśniowski, A.; Smith, P.W.F.; Bijak, J.; Raymer, J.; Forster, J.J. Bayesian Population Forecasting: Extending the Lee-Carter Method. Demography 2015, 52, 1035–1059.
98. Lutz, W.; Goldstein, J.R. Introduction: How to deal with uncertainty in population forecasting? Int. Stat. Rev. 2004, 72, 1–4.
99. Billari, F.C.; Graziani, R.; Melilli, E. Stochastic population forecasts based on conditional expert opinions. J. R. Stat. Soc. Ser. A (Stat. Soc.) 2012, 175, 491–511.
100. Billari, F.C.; Graziani, R.; Melilli, E. Stochastic Population Forecasting Based on Combinations of Expert Evaluations Within the Bayesian Paradigm. Demography 2014, 51, 1933–1954.
101. Mitchell, T. Machine Learning; McGraw Hill: New York, NY, USA, 1997.
102. Robinson, C.; Dilkina, B. A machine learning approach to modeling human migration. In Proceedings of the 1st ACM SIGCAS Conference on Computing and Sustainable Societies, Menlo Park and San Jose, CA, USA, June 2018; pp. 1–8.
103. Tarasyev, A.A.; Agarkov, G.A.; Hosseini, S.I. Machine learning in labor migration prediction. In Proceedings of the AIP Conference, Thessaloniki, Greece, 25–30 September 2017; AIP Publishing LLC: Melville, NY, USA, 2018; Volume 1978, p. 440004.
104. Kiossou, H.S.; Schenk, Y.; Docquier, F.; Houndji, V.R.; Nijssen, S.; Schaus, P. Using an interpretable Machine Learning approach to study the drivers of International Migration. arXiv 2020, arXiv:2006.03560.
105. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
106. Biau, G.; Scornet, E. A random forest guided tour. TEST 2015, 25, 197–227.
107. Aoga, J.; Bae, J.; Veljanoska, S.; Nijssen, S.; Schaus, P. Impact of weather factors on migration intention using machine learning algorithms. arXiv 2020, arXiv:2012.02794.
108. Cherkassky, V.; Ma, Y. Practical selection of SVM parameters and noise estimation for SVM regression. Neural Netw. 2004, 17, 113–126.
109. Zhang, L.; Luo, L.; Hu, L.; Sun, M. An SVM-based classification model for migration prediction of Beijing. Eng. Lett. 2020, 28, 1023–1030.
110. Golenvaux, N.; Alvarez, P.G.; Kiossou, H.S.; Schaus, P. An LSTM approach to Forecast Migration using Google Trends. arXiv 2020, arXiv:2005.09902.
111. Simini, F.; Barlacchi, G.; Luca, M.; Pappalardo, L. A Deep Gravity model for mobility flows generation. Nat. Commun. 2021, 12, 6576.
112. Bijak, J. Migration Forecasting: Beyond the Limits of Uncertainty; Global Migration Data Analysis Centre Data Briefing Series: Berlin, Germany, 2016; Issue 6; pp. 1–7.
113. Sirbu, A.; Andrienko, G.; Andrienko, N.; Boldrini, C.; Sharma, R. Human migration: The big data perspective. Int. J. Data Sci. Anal. 2021, 11, 341–360.

Figure 1. The flow of the human migration forecasting framework.

Figure 2. Key technology roadmap for population migration prediction. Source: results of systematic literature search by author.

Figure 3. Human migration research using various machine-learning methods.

Table 1. Classification of human migration forecasting methods.

Method	Model	Data Source	Spatial Attributes	Temporal Attributes	References
Deterministic method	--	US Department of Health and Human Services and US Bureau of the Census statistics	internal migration	long-term	[42]
Deterministic method	--	Human Mortality Database; Federal Statistical Office of Germany	international migration	long-term	[43,44]
Stochastic method	Bayesian Model	Statistics New Zealand online database INFOS and Australian Bureau of Statistics database	international migration	short-term	[45]
Stochastic method	Bayesian Model	Eurostat, United Nations Statistics Division statistics and the Council of Europe’s Demographic Yearbooks; Office for National Statistics census data; National Records of Scotland and the Office for National Statistics census data; United Nations Population Division’s biennial World Population Prospects report; Statistics Sweden census data; Korean Statistical Information Service statistics, Australian Bureau of Statistics census data	international migration	long-term	[46,47,48,49,50]
Stochastic method	Gravity Model	Internal Revenue Service statistics	internal migration	long-term	[51]
Stochastic method	Gravity Model	Internal Revenue Service statistics; World Bank’s Global Bilateral Migration database; Eurostat statistics, World DataBank statistical data (2016), Organization for Economic Co-Operation and Development statistics, French Centre d’Etudes Prospectives et d’Informations Internationales statistics	international migration	long-term	[52,53,54]
Stochastic method	Gravity Model	Google Trends data, Organization for Economic Co-Operation and Development statistics	international migration	short-term	[55]
Stochastic method	Time Series Model	Internal Revenue Service statistics	internal migration	short-term	[56]
Stochastic method	Time Series Model	Netherlands Central Bureau of Statistics census data	international migration	short-term	[57]
Stochastic method	Time Series Model	Mexican demographic surveys of households and American Community Survey data; Household surveys data; Federal Statistical Office of Germany census data	international migration	long-term	[58,59,60]
Stochastic method	Time Series Model	Internal Revenue Service statistics; International migration report (2017)	international migration	long and short-term	[61,62]
Stochastic method	Time Series Model	Author-collected datasets; Google Trends data	internal migration	short and long-term	[63,64]
Stochastic method	Econometrics Method	Trends in International Migration statistics, Migration Potential in Central and Eastern Europe statistics; Statistics Norway’s “Statbank”; World Population Prospects (2015); Russian Federation and Commonwealth of Independent States countries statistics; World Bank Global Bilateral Migration	international migration	long-term	[65,66,67,68,69]
Stochastic method	Econometrics Method	Organization for Economic Co-Operation and Development statistics, World Bank statistics, Federal Statistical Office census data; the Candidate Country Eurobarometer survey series; Household surveys data; Twitter; Google Trends data	international migration	short-term	[55,70,71,72,73]
Stochastic method	Econometrics Method	United States Bureau of the Census data	internal migration	long-term	[74]
Stochastic method	Econometrics Method	Site of the Main Department of Statistics in the Khmelnytskyi Region statistics; Twitter; data collected by the author	internal migration	short-term	[11,75,76]
Stochastic method	Grey Model	China Statistical Yearbook; US Department of Health and Human Services and US Bureau of the Census statistics	international migration	long-term	[12,13]
Machine-learning method	Classical Machine-Learning Method	Data collected by the author	internal migration	long-term	[14]
Machine-learning method	Classical Machine-Learning Method	Department of Overseas Labor statistics	international migration	long-term	[16]
Machine-learning method	Classical Machine-Learning Method	Household survey data	internal migration	--	[17]
Machine-learning method	Classical Machine-Learning Method	Mexican Migration Project statistics; Google Trends Index	international migration	short-term	[15,18]
Machine-learning method	Deep-Learning Method	Nationwide Human Mobility Dataset released by the Spanish Ministry of Transportation	internal migration	short-term	[77,78,79]
Machine-learning method	Deep-Learning Method	Google Trends Index; Organization for Economic Co-Operation and Development International Migration Database	international migration	long-term	[80]


© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
