1. Introduction
By 2050, about 7 in 10 people will live in cities [1]. As the urban population grows, urban transportation is expanding rapidly, which poses new challenges to the sustainable development of cities. Compared with private vehicles, urban rail transit can reduce transport-related energy consumption, travel costs, traffic congestion, and environmental pollution. Meanwhile, studies show that car ownership grows relatively slowly in cities with high urban rail transit intensity [2]. Therefore, the subway, bus, and other public transport facilities play an increasingly important role in realizing the sustainable development of cities. Among them, the subway is an important means of relieving urban traffic congestion and building a three-dimensional urban transportation system, owing to its large traffic volume, fast operating speed, and small floor area [3]. Besides, the subway emits fewer pollutants and saves more energy, which helps promote the green and healthy development of the city.
Therefore, a timely and effective subway transportation system is crucial. To avoid traffic congestion or paralysis caused by insufficient subway resources, passenger flow prediction can be used to allocate traffic resources effectively. Meanwhile, with the wide adoption of location-aware technology, urban spatiotemporal data (such as bus card data, subway card swiping data, and mobile GPS data) have become abundant [4]. Passenger flow can therefore be extracted from these data, which helps in studying human activity patterns. Intelligent passenger flow prediction based on big data can also facilitate the development of the intelligent transportation system (ITS) [5]. In summary, it is both achievable and imperative to develop a deep learning framework for subway passenger flow prediction.
Owing to the great practical value and difficulty of subway passenger flow prediction, researchers have devoted considerable effort to traffic flow prediction. Existing passenger flow prediction methods can be broadly classified into statistical methods, machine learning (ML) methods, and deep learning (DL) methods.
The statistical methods determine their parameters by processing the original data and make traffic predictions with a regression function [6]. Examples include the autoregressive integrated moving average model (ARIMA) [7,8] and its variations [9,10,11], and the Kalman filter model [12,13,14]. They are easy to compute and capture the linear characteristics of the data. However, they rely on stationarity assumptions and are strongly affected by fluctuating traffic data. In addition, they struggle to reflect the non-linear and complex characteristics of traffic flow.
The machine learning methods can learn the non-linear features and statistical regularities of traffic data from sufficient historical observations, which addresses the limitations of the statistical methods. Examples include support vector regression (SVR) [15,16,17], the k-nearest neighbor model (KNN) [18,19,20], the Bayesian model [21,22], the support vector machine (SVM) [23,24,25], and the fuzzy logic model [26,27]. They improve the accuracy of subway flow prediction, but they struggle to achieve good results in complex networks with numerous nodes. They mostly rely on complex manual feature engineering, which makes them less robust when modeling massive data, and they are incapable of processing raw spatiotemporal data [28]. Therefore, it is difficult for the machine learning methods to obtain the best prediction results from abundant spatiotemporal data.
The deep learning methods can perform feature engineering automatically and improve feature representation. Moreover, deep learning models have advantages in capturing non-linear and complex patterns, which helps them produce more accurate results. At present, the deep learning methods commonly used in spatiotemporal traffic flow prediction include the recurrent neural network (RNN) [29,30,31], the convolutional neural network (CNN) [32], and the graph neural network (GNN) [33,34]. Traffic flow prediction essentially depends on historical observations, so temporal dependency is indispensable. However, some deep learning models [35,36] only consider the temporal dependency of passenger flow and ignore the spatial dependency. In this way, the traffic prediction is divorced from spatial factors, such as roads and stations. Integrating the spatial dependency can further improve the accuracy of the model. Therefore, to address the shortcomings of single models in passenger flow prediction, some studies [37,38,39,40] introduce the CNN to model spatial dependency and combine it with the RNN [41] and its variants [42,43]. However, due to the non-Euclidean and time-varying characteristics of the subway network, it is difficult for the CNN to describe the complex spatial topological relationship. Therefore, some deep learning models [44,45,46] introduce the graph convolutional neural network (GCN) [47,48] to improve the capture of spatiotemporal features in the passenger flow. At the same time, the RNN models have limitations in capturing the temporal dependency. The attention model [49] can capture global and dynamic spatiotemporal characteristics, which is helpful for prediction. Some deep learning models [50,51,52,53] introduce the attention model into the field of traffic flow prediction.
Much progress has been made in this task. However, some knowledge gaps still exist. Statistical methods struggle to capture complex characteristics. Machine learning methods depend heavily on manually designed features. As for the deep learning methods, the existing models still have the following gaps:
- (1)
Most GCN-based methods overlook improvements to the adjacency matrix in subway passenger flow prediction. Firstly, they ignore the spatial influence of the entrances and exits of subway stations. Secondly, they ignore the influence of global stations on specific stations.
- (2)
Most methods rely on a single model to capture the temporal dependency, such as the RNN model and its variations, or the Transformer [54] model. However, these models are limited in capturing all the temporal characteristics. The RNN model and its variations focus only on local temporal characteristics, while the Transformer model focuses only on global temporal characteristics.
- (3)
Traffic prediction is commonly classified into two scales: short-term (≤30 min) and long-term (≥30 min) [44,53]. At present, subway passenger flow prediction is mostly short-term. However, long-term flow prediction is also very important because it allows more thorough preparation for operation scheduling.
In summary, to solve the above problems and better predict the subway passenger flow, the GCTN is proposed. It comprehensively models the local and global spatiotemporal dependency. Specifically, the original characteristics are obtained by diffusing the passenger flow features with a CNN. We believe that temporal feature modeling based entirely on captured spatial features would ignore part of the passenger flow characteristics. Therefore, the original characteristics and the characteristics captured by the GCN are fused as the basis for capturing the temporal characteristics. The fused features, containing both original and spatial features, are then input into the long short-term memory (LSTM) network and the Transformer. The LSTM captures the local temporal features, and the Transformer captures the global temporal features. We integrate the local and global temporal characteristics to obtain comprehensive temporal characteristics. Then, to model the spatiotemporal dependency jointly and balance the spatial and temporal influence, we further fuse the spatial characteristics into the comprehensive temporal characteristics. Therefore, we expect the model to improve the accuracy and stability of long-term prediction.
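As a rough illustration of the graph convolution step mentioned above, the following sketch applies the widely used symmetric normalization of an adjacency matrix with self-loops and then aggregates station features from neighboring stations. It shows the standard GCN propagation without learnable weights or an activation function, not the improved GCN of the GCTN. The function names and the toy two-station network are hypothetical.

```python
import math


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a square adjacency matrix."""
    n = len(adj)
    # Add self-loops so that each station keeps part of its own features.
    a_tilde = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    degree = [sum(row) for row in a_tilde]
    return [[a_tilde[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
            for i in range(n)]


def propagate(adj_norm, features):
    """Aggregate features over neighbors (no weights, no activation)."""
    n = len(features)
    dim = len(features[0])
    return [[sum(adj_norm[i][k] * features[k][d] for k in range(n))
             for d in range(dim)]
            for i in range(n)]


# Toy example (assumed data): two connected stations, features = [inflow, outflow].
adj = [[0.0, 1.0], [1.0, 0.0]]
features = [[2.0, 10.0], [4.0, 30.0]]
adj_norm = normalize_adjacency(adj)
smoothed = propagate(adj_norm, features)
```

In this toy case each station ends up with the average of its own and its neighbor's flow, which is the smoothing effect that lets a GCN share information along subway links.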
After exploring the subway passenger flow data, we divide the data into three patterns: close, daily, and weekly. This paper tries several fusion methods to explore the influence of the nearest neighbor and periodic segments. Compared with current subway passenger flow prediction models, the hybrid model proposed in this paper has the following advantages:
Through the improvements to the adjacency matrix and the GCN, the spatial structure of the subway network is expressed more clearly. Besides, we can better describe station characteristics and the influence of global stations, which improves the capture of spatial characteristics.
The spatiotemporal dependency is modeled comprehensively. We design a spatiotemporal block structure whose objective is to seamlessly model the characteristics extracted by the network.
The Transformer model is improved. A CNN is added to extract and fuse the intermediate features, which helps better analyze the subway time-series data.
The motivation behind the GCTN is to predict long-term subway passenger flow accurately. Passenger flow prediction can help subway dispatching, and it can assist citizens in planning routes and scheduling their time. Moreover, it can help reduce traffic pressure and construct a healthy, green, and sustainable city.
The rest of this paper is organized as follows. In Section 2, we describe the mathematical expression of the prediction problem. Then, we introduce the model architecture in Section 3. In Section 4, we show the results of the case analysis and discuss the factors that influence the prediction results. Finally, in Section 5, we summarize the results and limitations of this paper and explore future research directions.
2. Problem Definition
The subway passenger flow prediction is a spatiotemporal prediction problem. First of all, the spatial structure of the subway can be expressed as a graph $G = (V, E, A)$, where $V$ is the set of $N$ nodes representing the subway stations, $E$ is the set of edges between subway stations, and $A \in \mathbb{R}^{N \times N}$ is the adjacency matrix based on the connectivity and Euclidean distance between subway stations. Moreover, subway passenger flow is affected not only by the spatial structure of the subway network but also by the passenger flow in historical periods. Therefore, the prediction problem can be expressed by the following Equation (1):
$$\hat{Y} = f(X; G, \theta) \tag{1}$$
where $\hat{Y}$ is the predicted subway passenger flow of $N$ stations in time $T'$, and $X$ is the historical subway passenger flow on $N$ stations in time $T$. Both $\hat{Y}$ and $X$ include inflow volume and outflow volume. $f$ is the deep learning model, $G$ is the graph structure of the subway network, and $\theta$ is the learnable parameter.
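To make the adjacency matrix "based on the connectivity and Euclidean distance" more concrete, here is a minimal sketch in which connected station pairs receive a weight that decays with their Euclidean distance. The Gaussian kernel, the `sigma` parameter, the function name, and the toy coordinates are illustrative assumptions; the exact weighting used by the GCTN is not given in this section.

```python
import math


def build_adjacency(coords, edges, sigma=1.0):
    """Weighted adjacency matrix for a station graph.

    coords: list of (x, y) station positions.
    edges:  list of (i, j) index pairs for directly connected stations.
    sigma:  distance scale of the (assumed) Gaussian kernel.
    """
    n = len(coords)
    adjacency = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        dist = math.dist(coords[i], coords[j])
        # Closer connected stations get a weight nearer to 1.
        weight = math.exp(-(dist ** 2) / (sigma ** 2))
        adjacency[i][j] = weight
        adjacency[j][i] = weight  # subway links are treated as undirected
    return adjacency


# Toy network (assumed data): three stations on one line, 0 - 1 - 2.
coords = [(0.0, 0.0), (1.0, 0.0), (1.0, 2.0)]
edges = [(0, 1), (1, 2)]
adjacency = build_adjacency(coords, edges, sigma=2.0)
```

Stations 0 and 2 are not directly connected, so their entry stays zero regardless of how close they are; only connectivity decides which entries are non-zero, while distance decides how large they are.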
In this paper, historical subway passenger flow is divided into three patterns (close, daily, and weekly). The close pattern represents the recent time, while the daily and weekly patterns represent the historical passenger flow at the target time on different days or weeks. For example, suppose we predict the passenger flow in the next time slice. The close pattern consists of the few time slices immediately before the target time. The daily pattern consists of the time slices at the same time of day in the previous few days, and the weekly pattern consists of the time slices at the same time in the previous few weeks. The different patterns help in studying the influence of periodic time slices on subway passenger flow at the target time. The prediction process is shown in Figure 1.
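The three patterns can be illustrated by computing which historical time-slice indices feed the prediction of one target slice. This is a sketch under stated assumptions: the function name is hypothetical, time slices are indexed consecutively, and a day is assumed to contain 144 slices of 10 min (a full 24 h; a real subway data set would use its operating hours instead).

```python
def pattern_indices(target, n_close, n_daily, n_weekly, slices_per_day):
    """Return the time-slice indices of the close, daily, and weekly patterns."""
    # Close: the slices immediately before the target.
    close = [target - k for k in range(n_close, 0, -1)]
    # Daily: the same time of day on the previous days.
    daily = [target - k * slices_per_day for k in range(n_daily, 0, -1)]
    # Weekly: the same time of day on the same weekday in previous weeks.
    weekly = [target - k * 7 * slices_per_day for k in range(n_weekly, 0, -1)]
    needed = close + daily + weekly
    if needed and min(needed) < 0:
        raise ValueError("not enough history for the requested patterns")
    return {"close": close, "daily": daily, "weekly": weekly}


# Assumed setting: 10 min slices, 144 slices per day.
patterns = pattern_indices(target=2100, n_close=3, n_daily=2, n_weekly=2,
                           slices_per_day=144)
```

With these numbers, the close pattern is slices 2097 to 2099, the daily pattern is slices 1812 and 1956, and the weekly pattern is slices 84 and 1092.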
This paper builds a hybrid neural network model, the GCTN (graph convolutional and comprehensive temporal neural network), to solve the subway passenger flow prediction problem. In addition to capturing the spatiotemporal dependency, the proposed model considers the correlation between stations at different time steps to strengthen the long-term passenger flow prediction.
At present, the commonly used long-term prediction method is autoregressive multi-step prediction. However, in autoregressive multi-step prediction the prediction error propagates and gradually accumulates. Therefore, we directly model long-term temporal dependency from historical data, which is used to predict multiple target flows. Meanwhile, to avoid error propagation, we carry out parallel training through the joint loss of multiple time steps, which may achieve a more accurate multi-step prediction. The model is tested and verified on passenger flow data with time intervals (TIs) of 10 min, 20 min, and 30 min.
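The idea of a joint loss over multiple time steps can be sketched as follows: the model outputs all horizon steps at once, a loss is computed per step, and the per-step losses are averaged into one training objective. The use of MAE as the per-step loss, the equal weighting of steps, and the function name are assumptions made for illustration; the section does not specify the loss used by the GCTN.

```python
def joint_multistep_loss(predictions, targets):
    """Average the per-step MAE over all predicted time steps.

    predictions, targets: one list per horizon step, each holding
    the flows of all stations at that step.
    """
    step_losses = []
    for pred_step, true_step in zip(predictions, targets):
        abs_errors = [abs(p - t) for p, t in zip(pred_step, true_step)]
        step_losses.append(sum(abs_errors) / len(abs_errors))
    joint = sum(step_losses) / len(step_losses)
    return joint, step_losses


# Toy example (assumed data): two horizon steps, two stations.
predictions = [[10.0, 20.0], [30.0, 40.0]]
targets = [[12.0, 20.0], [30.0, 44.0]]
joint_loss, per_step = joint_multistep_loss(predictions, targets)
```

Because every step is predicted directly from the historical input, no predicted value is fed back as an input, so an error at step 1 cannot compound into step 2 the way it does in an autoregressive rollout.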
5. Conclusions
To address the problem of subway passenger flow prediction, we argue that existing works ignore the spatial influence of the entrances and exits of subway stations and the influence of global stations on specific stations. In addition, many methods are based on a single RNN model or its variations, or on the Transformer model, each of which has limitations in capturing temporal features. This paper proposes a hybrid neural network, the GCTN, to solve these problems.
We test the model on Shanghai subway passenger flow data. The results show that the GCTN achieves good MAE, RMSE, and WMAPE in multi-step prediction. The GCTN predicts better at peaks and in periods when passenger flow changes quickly, which favors the practical application of the model. We compared different temporal combination modes, and the results show that combining the close, daily, and weekly patterns improves the prediction performance. Meanwhile, we verified the proposed improvements in the GCTN, and we think the combination of the CNN and the Transformer is helpful. We also found that the RMSE of a station's predicted flow is positively correlated with its number of entrances and exits and its number of POIs.
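For reference, the three evaluation metrics named above can be computed as in the sketch below. The MAE and RMSE definitions are standard. WMAPE is taken here in its common form, the sum of absolute errors divided by the sum of actual values, which is assumed to match the definition used in the experiments. The sample values are made up for illustration and are not results from the paper.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def wmape(y_true, y_pred):
    """Weighted MAPE: total absolute error relative to total actual flow."""
    total_flow = sum(abs(t) for t in y_true)
    if total_flow == 0:
        raise ValueError("WMAPE is undefined when all actual flows are zero")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / total_flow


# Illustrative values only (assumed data).
y_true = [100.0, 200.0, 300.0]
y_pred = [110.0, 190.0, 330.0]
scores = {"MAE": mae(y_true, y_pred),
          "RMSE": rmse(y_true, y_pred),
          "WMAPE": wmape(y_true, y_pred)}
```

Unlike a plain MAPE, WMAPE weights each error by the actual flow, so low-traffic stations and off-peak slices with near-zero flow do not dominate the score.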
However, some limitations remain. Firstly, the validation data set does not cover a period long enough to study the influence of long-term temporal factors, such as seasons. Secondly, the influence of external factors, such as weather, is not considered. Finally, we do not study dynamic spatial characteristics, which might improve the modeling of spatial dependency. In the future, we will further explore the influence of external features and study the application of the GCTN to longer data sets. We also intend to study the influence of dynamic spatiotemporal features in the deep learning model and the differences between kinds of attention mechanisms in capturing global temporal features.