EnGS-DGR: Traffic Flow Forecasting with Indefinite Forecasting Interval by Ensemble GCN, Seq2Seq, and Dynamic Graph Reconfiguration

Han, Shi-Yuan; Zhao, Qiang; Sun, Qi-Wei; Zhou, Jin; Chen, Yue-Hui

doi:10.3390/app12062890


by Shi-Yuan Han ^*, Qiang Zhao, Qi-Wei Sun, Jin Zhou and Yue-Hui Chen

Shandong Provincial Key Laboratory of Network Based Intelligent Computing, University of Jinan, Jinan 250022, China

^* Author to whom correspondence should be addressed.

Appl. Sci. 2022, 12(6), 2890; https://doi.org/10.3390/app12062890

Submission received: 3 February 2022 / Revised: 27 February 2022 / Accepted: 9 March 2022 / Published: 11 March 2022

(This article belongs to the Special Issue Advances in Vehicle Technology and Intelligent Transport Systems)


Abstract:

An accurate and reliable traffic flow forecast is regarded as one of the foundational functions of an intelligent transportation system. In this paper, a new model for traffic flow forecasting, named EnGS-DGR, is designed based on ensemble learning of a graph convolutional network (GCN), a sequence-to-sequence (Seq2Seq) learning model, and a dynamic graph reconfiguration (DGR) algorithm. In the first stage, instead of employing all nodes in the traffic network, the DGR algorithm reconstructs a traffic graph topology consisting of tightly correlated traffic nodes under a specific forecasting interval, where the degree of correlation among the traffic nodes is quantized from the perspective of multi-view clustering. In the second stage, a GCN-Seq2Seq integration strategy is introduced to extract the spatio-temporal dependence of the data and forecast traffic flow. We applied the proposed EnGS-DGR to two datasets from the highways of Los Angeles County and California's Bay Area; the simulation results show that EnGS-DGR is superior to eight other popular traffic flow forecasting models in terms of three common performance metrics.

Keywords:

traffic flow forecasting; GCN; Seq2Seq; dynamic graph reconfiguration; traffic network

1. Introduction

Currently, driven by advances in traffic sensing and artificial intelligence technology, Intelligent Transportation Systems (ITSs) have been continuously developed to improve transportation management and travel guidance [1,2]. On the basis of massive traffic data, traffic flow forecasting provides fundamental support for the basic functions of ITSs, such as traffic congestion control, resource allocation, and personalized trip recommendations [3,4]. Existing methods for traffic flow forecasting can be roughly divided into three categories: early methods based on statistical analysis, methods based on machine learning models, and methods based on deep learning models.

At the early stage, the majority of methods for traffic flow forecasting were related to time series forecasting. The forecasting results were attained by statistical analysis, such as the history average (HA) model, the autoregressive integrated moving average (ARIMA) model [5,6], the Kalman filtering model [7], etc. As a typical statistical model, the original HA model takes the average traffic flow value over past forecasting intervals as the forecasting result, and it was widely used in early European path planning systems [8]. Although the HA model is simple and fast, its forecasting accuracy is low due to the nonlinear characteristics of traffic flow. The ARIMA model was proposed to forecast traffic flow in 1976 [9]; it relies on a large amount of historical data and is not precise in forecasting non-stationary traffic flow. The vector auto-regression (VAR) model is suitable for forecasting complex traffic flows in which multiple time series interact [10]. Overall, however, the statistical models cannot capture the nonlinear characteristics of traffic flow.

Machine learning methods, such as support vector machine (SVM) [11,12] and K-nearest neighbor [13,14], can improve the effectiveness of extracting the temporal dependence characteristics of traffic flow. The K-nearest neighbor method forecasts future traffic flows based on the similarity between the historical and current traffic flow states [15]. This method can achieve high accuracy in time series forecasting, but at the cost of high time complexity; moreover, once erroneous data appear in the traffic data, its accuracy may deteriorate. The traffic forecasting method based on support vector regression (SVR) is especially effective in dealing with nonlinear traffic [16]. However, SVM-based traffic flow forecasting models are usually sensitive to parameter tuning and to the selection of the kernel function [17]. Because they can capture the nonlinear characteristics of traffic flow, the machine-learning-based methods are more accurate than the statistical models. However, these models take only the temporal dependence of traffic flow into consideration.

In the past, some scholars used deep learning methods based only on a feedforward neural network (FNN) to forecast traffic flow, but it was difficult to achieve satisfactory results given the complex spatial-temporal characteristics of traffic flow [18]. Ref. [19] used the ensemble empirical mode decomposition (EEMD) model to suppress noise in the original traffic flow data and fed the resulting data into an artificial neural network (ANN) to forecast traffic flow with different step sizes, but the temporal and spatial characteristics of traffic flow were not taken into consideration. In recent years, deep-learning methods have been widely employed to capture the spatial-temporal correlation of traffic flow, and recurrent neural network (RNN) and convolutional neural network (CNN) methods have been applied to traffic flow forecasting. Combining the gated recurrent unit (GRU), the long short-term memory (LSTM), and the variants of RNN, ref. [20] proposed an effective model to forecast the traffic time series from the perspective of temporal dependence. A multi-feature forecasting model based on CNN was proposed in [21] to include a variety of temporal and spatial features and external factors. Taking the spatial-temporal correlation of taxi demand allocation into account, ref. [22] proposed a combined CNN and LSTM model to forecast taxi demand allocation in cities, which is applicable only to grid data in Euclidean space and is not suitable for the graph-structured data of a traffic network in non-Euclidean space. More recently, the graph neural network (GNN) has attracted the attention of many scholars and has been applied to traffic flow forecasting in non-Euclidean space. Ref. [23] proposed the diffusion convolutional recurrent neural network (DCRNN), which models traffic flow as a random walk on the graph structure and captures the temporal features through an encoder–decoder architecture.
A temporal graph convolution network (T-GCN) model was proposed in [24], in which GCN was employed to capture spatial dependence, and a module combining GRU and advanced RNN was used to capture the temporal dependence of traffic flow. Ref. [25] proposed a spatio-temporal graph convolutional network (STGCN) method based on graph structure, which captures the temporal and spatial correlation of traffic flow using convolutional structures. Ref. [26] proposed an integrated deep graph reinforcement learning network, which takes the preprocessed traffic dataset as input, forecasts with a convolution network and an attention network, and finally optimizes the weight coefficients of the two forecasts using reinforcement learning to generate the final forecasting results. Ref. [27] proposed an integrated deep learning spatial-temporal forecasting model for urban hot spots, which combines a geographic forecasting neural network with a semantic forecasting neural network for joint forecasting. Ref. [28] proposed the attention-based spatial-grey-graph neural network-long short-term memory neural network (AST-GCN-LSTM) model, which improves the graph convolution network and expands the neighborhood of convolution. Although the above GCN-based models comprehensively consider the spatial-temporal correlation characteristics of traffic flow, most existing deep-learning models require the extraction of spatial features on the entire traffic network with fixed nodes and a fixed network topology. However, the association relationships among the traffic nodes vary with the forecasting interval. The spatial-temporal correlation characteristics for forecasting traffic flow are shown in Figure 1.

Motivated by integrating the deep learning method with multi-view subspace clustering algorithm, this paper proposes an ensemble learning model named EnGS-DGR based on a dynamic graph reconfiguration algorithm with a specific forecasting interval, and an integration model combining the GCN with the Seq2Seq learning modular. The main contributions of this paper are described as follows.

(1): The traffic graph topology with closely-correlated traffic nodes is formulated with a specific forecasting interval on the basis of the designed dynamic graph reconfiguration algorithm, in which the method of the low-rank tensor constrained multi-view subspace clustering is employed for a quantitative analysis of the correlation degree among the traffic nodes.
(2): Taking the time series obtained from the reconfigured graph as the input, a GCN-Seq2Seq integration strategy is proposed to extract the spatio-temporal correlation characteristics, in which the local information of the traffic nodes and the overall spatial characteristics extracted by the GCN modular are integrated and delivered to the Seq2Seq modular to obtain the traffic flow forecast.

The remainder of this paper is organized as follows: In Section 2, the proposed EnGS-DGR model is described in detail. Section 3 verifies the proposed EnGS-DGR model and analyzes the experimental results. Section 4 summarizes this paper.

2. Methodology

2.1. Problem Definition

In this subsection, the traffic flow forecasting problem will be introduced briefly.

The underlying traffic network can be represented by a directed graph $G(V, E, A)$, where V is the vertex set composed of traffic monitors with $|V| = N$; E is the edge set composed of the road segments connecting the traffic monitors; $A \in R^{|E| \times |E|}$ is the adjacency matrix of the traffic network, where $|E|$ denotes the cardinality of the edge set E. In detail, $A_{ij} = 1 / d_{ij}$ represents the weight from node $v_{i}$ to node $v_{j}$ along the driving direction, with road section length $d_{ij}$.
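The inverse-distance weighting above can be sketched in a few lines. This is only an illustration under stated assumptions: the matrix is indexed by node pairs, absent road segments get weight 0, and the function name `build_adjacency` and the three-node corridor with its lengths are hypothetical, not taken from the datasets.

```python
def build_adjacency(num_nodes, segments):
    """Build a directed, weighted adjacency matrix with A[i][j] = 1 / d_ij.

    `segments` is an iterable of (i, j, d_ij) tuples, one per directed road
    section from node i to node j with positive length d_ij.
    Pairs without a road section keep weight 0.0 (an assumption).
    """
    adj = [[0.0] * num_nodes for _ in range(num_nodes)]
    for i, j, dist in segments:
        if dist <= 0:
            raise ValueError("road section length must be positive")
        adj[i][j] = 1.0 / dist
    return adj


# Hypothetical corridor: 0 -> 1 (2 km), 1 -> 2 (4 km), 2 -> 0 (5 km).
segments = [(0, 1, 2.0), (1, 2, 4.0), (2, 0, 5.0)]
A = build_adjacency(3, segments)
```

Because the graph is directed, `A[0][1]` and `A[1][0]` are set independently, so a one-way section leaves the reverse entry at zero.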

The spatio-temporal variable matrix $[G_{t - n}, \dots, G_{t - 1}, G_{t}]$ is employed to denote the traffic flow sequences distributed over the traffic network, where n refers to the look-back time window and $G_{t} = [v_{t}^{1}, \dots, v_{t}^{|E| - 1}, v_{t}^{|E|}]$. Thus, the traffic flow forecasting problem is formulated as a multivariate time series problem to estimate the traffic flow at a future forecasting interval T based on the historical traffic data collected over previous forecasting intervals from the overall traffic network with N traffic monitors.

Given the traffic network topology $G(V, E)$ and the monitored data up to time t, the traffic flow forecasting problem is to forecast the traffic flow $[Y_{t + 1}, \dots, Y_{t + T}]$ under forecasting interval T. Its definition is shown in Figure 2, and its formula is as follows:

$[Y_{t + 1}, \dots, Y_{t + T}] = f (X_{t - n}, \dots, X_{t - 1}, X_{t}) .$

(1)

The direct forecasting method [29,30] used in this paper is shown in Figure 3. The historical values $(X_{t - n}, X_{t - n + 1}, \dots, X_{t - 1}, X_{t})$ are input into the forecasting model to forecast future traffic in a single pass at T-fold resolution, i.e., $(Y_{t + 1}, \dots, Y_{t + T - 1}, Y_{t + T})$.
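As a rough illustration of how training pairs for direct forecasting could be cut from one monitor's series, the sketch below slides a window over the data. The function name `make_direct_samples` and the speed values are hypothetical; `lookback` plays the role of n and `horizon` the role of T.

```python
def make_direct_samples(series, lookback, horizon):
    """Cut (history, future) pairs for direct multi-step forecasting.

    history = X_{t-n}, ..., X_t      (lookback + 1 values)
    future  = Y_{t+1}, ..., Y_{t+T}  (horizon values, forecast in one pass)
    """
    samples = []
    for t in range(lookback, len(series) - horizon):
        history = series[t - lookback : t + 1]
        future = series[t + 1 : t + 1 + horizon]
        samples.append((history, future))
    return samples


# Hypothetical travel speeds at one monitor.
speeds = [60, 62, 61, 58, 55, 57, 59, 63]
samples = make_direct_samples(speeds, lookback=2, horizon=3)
```

Each pair maps the whole history window to all T targets at once, which is what distinguishes the direct method from feeding one-step forecasts back into the model.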

Remark 1.

The arbitrary forecasting interval T is predefined before training the forecasting model. Meanwhile, the sampling period in datasets could be the same as the predefined forecasting interval. It is assumed that the original sampling period could be small enough to obtain the required datasets.

2.2. EnGS-DGR Model

In this subsection, the EnGS-DGR model is proposed to capture the spatial-temporal correlation characteristics of traffic flow. It is mainly composed of two parts: a dynamic graph reconfiguration algorithm with a specific forecasting interval and a deep-learning integration strategy combining the graph convolutional network (GCN) with the sequence-to-sequence (Seq2Seq) learning modular. The framework of the proposed EnGS-DGR model is shown in Figure 4.

Firstly, multi-view subspace clustering is adopted to evaluate and quantify the correlation degree of each traffic node under the specific forecasting interval; a subgraph is then reconfigured from the traffic nodes whose correlation degrees are higher than a defined threshold. Taking the time series of the reconfigured subgraph as the input, the overall spatial characteristics of the traffic network are aggregated by the GCN. Fusing the output of the GCN with the local information of the selected nodes as the input, we adopt the Seq2Seq modular to extract the temporal characteristics. Finally, by mixing the obtained spatial-temporal correlation characteristics through cross-correlation coefficients, we obtain the traffic flow forecasting result. The details of the EnGS-DGR model are described as follows:

2.2.1. Dynamic Graph Reconfiguration Algorithm

The data are first preprocessed so that the sampling interval matches the predefined forecasting interval, and mean interpolation is used to fill in the missing values in the dataset. The processed data are then standardized and used by the DGR algorithm.
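The preprocessing steps can be sketched as follows. This is a minimal illustration under assumptions the text does not fix: missing samples are marked with `None`, "mean interpolation" is read as filling with the mean of the observed values, resampling averages consecutive blocks, and standardization is a z-score. All function names and the sample values are hypothetical.

```python
import math


def fill_missing_with_mean(values):
    """Replace None entries with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


def resample_mean(values, factor):
    """Average consecutive blocks of `factor` samples (e.g. 3 x 5 min -> 15 min)."""
    return [
        sum(values[k : k + factor]) / factor
        for k in range(0, len(values) - factor + 1, factor)
    ]


def standardize(values):
    """Z-score standardization; assumes the series is not constant."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / std for v in values]


# Hypothetical 5 min speed samples with two gaps.
raw = [60.0, None, 66.0, 54.0, 60.0, None]
filled = fill_missing_with_mean(raw)
coarse = resample_mean(filled, factor=3)
scaled = standardize(coarse)
```

Filling before resampling keeps every block complete; other interpolation choices (for example, averaging the two neighbouring samples) would slot into the same place.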

Instead of the Laplacian Matrix, a low-rank tensor-constrained multi-view subspace representation algorithm is employed to reveal the correlations between the traffic nodes in the road network. Compared with the Laplacian matrix, which can only be applied to the single-feature datasets, this algorithm is more adaptable to the datasets with multi-dimensional features. By defining various features of each traffic node in the road network as different views, the DGR algorithm is used to jointly learn the subspace representations under each feature view in order to mine the high-order correlations within and between the views at the same time. This algorithm is expected to reveal not only the inner associations of views, but also the correlations between the views, in which the low-rank constraint is used to represent the subspace and reduce the size of subspace representation. The specific algorithm flow is shown in Figure 5.

The dataset containing multi-dimensional features is represented in the multi-view subspace. The data from all nodes in the traffic network $G(V, E)$ can be represented as $X = [X^{(1)}, X^{(2)}, \dots, X^{(b)}, \dots, X^{(B)}]$, where $X^{(b)} \in R^{S_{dim} \times N}$ is the matrix composed of the bth feature (view), $S_{dim}$ is the length of the feature vector (e.g., average travel speed, distance to other nodes), N is the number of nodes, and B is the number of views. The coefficient matrix of every view can be solved by the following formula:

$\begin{matrix} \min_{Z^{(b)}, K^{(b)}} \|Z\|_{*} + λ \|K\|_{2, 1}, \\ \text{s.t.} \; X^{(b)} = X^{(b)} Z^{(b)} + K^{(b)}, \; b = 1, 2, \dots, B, \\ Z = ψ (Z^{(1)}, Z^{(2)}, \dots, Z^{(B)}), \\ K = [K^{(1)}; K^{(2)}; \dots; K^{(B)}], \end{matrix}$

(2)

where $X^{(b)}$ represents the data matrix of the bth view; $Z^{(b)}$ is the coefficient matrix of $X^{(b)}$; $ψ (\cdot)$ combines the different $Z^{(b)}$ into a tensor $Z$ with dimension $N \times N \times B$; in general, $Z \in R^{I_{1} \times I_{2} \times \dots \times I_{M}}$ is an M-order tensor; $\|\cdot\|_{*}$ represents the low-rank constraint on the tensor, that is, the tensor nuclear norm; K is the reconstruction error matrix, which stacks the $K^{(b)}$ vertically; $\|K\|_{2, 1}$ is the $l_{2, 1}$ norm, which improves the clustering robustness. The objective is to find the lowest-rank subspace representation among the many candidate solutions through joint optimization over all views.
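For reference, the standard matrix forms of the two norms can be written out as below. This is a recap under assumptions rather than a statement from the paper: $σ_{k}(\cdot)$ (the kth singular value) and $K_{ij}$ (the entry of K in row i, column j) are notation introduced only for this note, and the column-wise convention for the $l_{2, 1}$ norm is the one commonly used in low-rank representation.

```latex
\|Z_{(m)}\|_{*} = \sum_{k} \sigma_{k}\left(Z_{(m)}\right), \qquad
\|K\|_{2,1} = \sum_{j} \sqrt{\sum_{i} K_{ij}^{2}}
```

Under this convention the $l_{2, 1}$ norm sums the Euclidean norms of whole columns, so minimizing it pushes entire columns of K toward zero and concentrates the reconstruction error on a few samples.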

The low-rank constraint on the tensor can be expressed by the tensor nuclear norm:

$\|Z\|_{*} = \sum_{m = 1}^{M} ξ_{m} \|Z_{(m)}\|_{*},$

(3)

where $ξ_{m}$ is a constant with $ξ_{m} > 0$ and $\sum_{m = 1}^{M} ξ_{m} = 1$, and $Z_{(m)}$ is the matrix obtained by unfolding the tensor $Z$ along mode m. Substituting (3) for $\|Z\|_{*}$ in the objective function transforms the optimization problem into:

$\min_{Z^{(b)}, K^{(b)}} \|K\|_{2, 1} + \sum_{m = 1}^{M} ϖ_{m} \|Z_{(m)}\|_{*}$

(4)

where $ϖ_{m} = \frac{ξ_{m}}{λ} > 0$ represents the constraint strength of the low-rank tensor.

By using the Alternating Direction Method of Multipliers (ADMM) algorithm of the extended Augmented Lagrange Method (ALM) algorithm in [31] to solve the optimization problem (4), the tensor $Z$ can be obtained. Then the subspace representations of all views are combined by $Q = \frac{1}{V} \sum_{v = 1}^{V} (|Z^{(v)}| + |{Z^{(v)}}^{T}|)$, where the resulting matrix Q is the required affinity matrix, $Z^{(v)}$ is the coefficient matrix of the vth view, and V is the number of views. The affinity matrix Q is expressed as:

$Q = \begin{pmatrix} 1 & q_{12} & \cdots & q_{1 N} \\ q_{21} & 1 & \cdots & q_{2 N} \\ \vdots & \vdots & \ddots & \vdots \\ q_{N 1} & q_{N 2} & \cdots & 1 \end{pmatrix}$

(5)

where $q_{ij}$ represents the quantized degree of correlation between nodes i and j. The greater the value of $q_{ij}$, the stronger the correlation between nodes i and j. Based on the scale of traffic flow forecasting, an appropriate threshold $q^{H}$ is selected, and a subgraph is constructed from the traffic nodes whose correlation degree is higher than $q^{H}$, which is described as:

$G_{sub} = (V_{sub}, E_{sub}, A_{sub})$

(6)

where $V_{sub}$ is the node set consisting of the selected nodes; $E_{sub}$ is the edge set composed of the road segments connecting the selected nodes; $A_{sub} \in R^{|E_{sub}| \times |E_{sub}|}$ is the adjacency matrix of the constructed subgraph, where $|E_{sub}|$ denotes the cardinality of the edge set $E_{sub}$. In detail, $A_{sub} (i, j) = 1 / d_{ij}$ represents the weight from selected node $v_{i}$ to selected node $v_{j}$ along the driving direction, with road section length $d_{ij}$.
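The affinity construction and thresholding can be sketched as follows, taking the per-view coefficient matrices as given (the ADMM solver is not reproduced here). Two points are assumptions of this sketch rather than details from the text: the affinity is averaged over the views with the sum of both absolute-value terms inside the average, and nodes are selected by their affinity to a single target node. The function names, the two toy views, and the threshold 0.3 are hypothetical.

```python
def affinity_from_views(view_coeffs):
    """Q = (1/V) * sum_v (|Z_v| + |Z_v^T|) for square coefficient matrices."""
    num_views = len(view_coeffs)
    n = len(view_coeffs[0])
    Q = [[0.0] * n for _ in range(n)]
    for Z in view_coeffs:
        for i in range(n):
            for j in range(n):
                Q[i][j] += (abs(Z[i][j]) + abs(Z[j][i])) / num_views
    return Q


def select_subgraph_nodes(Q, target, threshold):
    """Keep the target node and every node whose affinity to it exceeds the threshold."""
    return [j for j in range(len(Q)) if j == target or Q[target][j] > threshold]


# Two hypothetical views over three nodes.
Z1 = [[0.5, 0.4, 0.0], [0.2, 0.5, 0.1], [0.0, -0.1, 0.5]]
Z2 = [[0.5, 0.2, 0.1], [0.4, 0.5, 0.0], [0.1, 0.0, 0.5]]
Q = affinity_from_views([Z1, Z2])
selected = select_subgraph_nodes(Q, target=0, threshold=0.3)
```

Adding each matrix to its transpose makes Q symmetric, so the correlation between nodes i and j does not depend on the order in which they are compared.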

2.2.2. GCN Modular

Convolutional neural network (CNN) is a feedforward neural network containing convolution operation. In the past few years, CNN has made great contributions to image processing, which can deal with the Euclidean space data of grid structure well. However, it is difficult to achieve the considerable performance for the irregular data in non-Euclidean space in a complex traffic network. Therefore, in order to deal with the traffic data of non-Euclidean space, a graph convolution network (GCN) is presented in [32], which is a convolution operation of graph structure in non-Euclidean space. It can aggregate neighborhood information to update itself to obtain the deep feature representation of nodes.

By feeding the reconfigured graph $G_{sub}$ into the GCN, the spatial dependence of traffic flow can be extracted effectively. For the non-Euclidean topology of the reconfigured graph $G_{sub}$, a multilayer graph convolution operation is expressed as:

$H^{(l + 1)} = σ ({\tilde{D}}^{- \frac{1}{2}} \tilde{A} {\tilde{D}}^{- \frac{1}{2}} H^{(l)} W^{(l)}),$

(7)

where $\tilde{A} = A_{sub} + I_{N}$ is the adjacency matrix $A_{sub}$ of the reconfigured graph $G_{sub}$ plus the identity matrix $I_{N}$; $\tilde{D}$ is the corresponding degree matrix of graph $G_{sub}$; $H^{(l)}$ is the node representation at layer l; $W^{(l)}$ is the parameter matrix of layer l; and $σ (\cdot)$ is a nonlinear activation function. The resulting spatial variables $H^{(L)}$ for the reconfigured graph $G_{sub}$ are then fed to the Seq2Seq modular.
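A single propagation step of (7) can be sketched with plain Python lists. The choices below are assumptions made for illustration: ReLU stands in for the unspecified activation, the degree of a node is the row sum of the self-looped adjacency matrix, and the two-node graph, features, and weights are hypothetical.

```python
import math


def matmul(A, B):
    """Dense matrix product for lists of lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def gcn_layer(adj, H, W):
    """One graph convolution: ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    n = len(adj)
    # Add self-loops so every node keeps its own features.
    a_tilde = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    # Degree of each node = row sum of the self-looped adjacency (assumption).
    deg = [sum(row) for row in a_tilde]
    # Symmetric normalization: entry (i, j) is divided by sqrt(d_i * d_j).
    a_hat = [[a_tilde[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    out = matmul(matmul(a_hat, H), W)
    return [[max(0.0, x) for x in row] for row in out]


# Hypothetical two-node graph with one input feature and two output features.
adj = [[0.0, 1.0], [1.0, 0.0]]
H0 = [[1.0], [3.0]]
W0 = [[1.0, -1.0]]
H1 = gcn_layer(adj, H0, W0)
```

In this toy case both nodes end up with the average of the two input features, which shows the neighborhood aggregation that the normalized adjacency performs.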

2.2.3. Seq2Seq Modular

Seq2Seq modular is proposed based on the encoder–decoder architecture in the EnGS-DGR model to extract temporal characteristics. The encoder and decoder in the Seq2Seq modular generally adopt the recurrent neural network (RNN), the gated recurrent unit network (GRU), and the long short term memory network (LSTM).

In traffic flow forecasting, the output at the current moment is related not only to the previous state but possibly also to the future state. Therefore, we choose a bidirectional LSTM (BiLSTM) network as the encoder. BiLSTM is a variant of LSTM [33] in which two separate LSTMs process the input sequence from both directions, so that both past and future contexts can be exploited. The input of the LSTM is a variable-length time series $IN = \{in_{1}, in_{2}, \dots, in_{n}\}$, where $in_{i} \in R^{d_{m}}$, $d_{m}$ is the number of features at each time index i, n is the length of the input time series, and $in_{i} = [{H^{(L)}}_{i}; Tr_{i}]$ is the combination of the GCN output ${H^{(L)}}_{i}$ at moment i and the traffic characteristic $Tr_{i}$ of the target node at moment i. With the input at time t written as $x_{t}$, the hidden vector is updated as follows:

$\begin{matrix} i_{t} = σ (W_{x i} x_{t} + W_{h i} h_{t - 1} + b_{i}), \\ f_{t} = σ (W_{x f} x_{t} + W_{h f} h_{t - 1} + b_{f}), \\ c_{t} = f_{t} \otimes c_{t - 1} + i_{t} \otimes \tanh (W_{x c} x_{t} + W_{h c} h_{t - 1} + b_{c}), \\ o_{t} = σ (W_{x o} x_{t} + W_{h o} h_{t - 1} + b_{o}), \\ h_{t} = o_{t} \otimes \tanh (c_{t}), \end{matrix}$

(8)

where $i_{t}$, $f_{t}$, and $o_{t}$ represent the input gate, forget gate, and output gate at time t, respectively; $c_{t}$, $σ$, and $\otimes$ represent the cell state vector at time t, the sigmoid function, and element-wise multiplication, respectively.
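To make the gate updates in (8) concrete, the sketch below runs one step for a scalar input and a scalar state, so that every product is an ordinary multiplication. The parameter layout (a dictionary mapping each gate to a hypothetical `(w_x, w_h, b)` triple) and the all-zero parameters are assumptions chosen to keep the arithmetic easy to follow.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM update for scalar input and scalar hidden/cell state.

    `params` maps "i", "f", "o", "c" to (w_x, w_h, b) for that gate or candidate.
    """
    def pre(gate):
        w_x, w_h, b = params[gate]
        return w_x * x_t + w_h * h_prev + b

    i_t = sigmoid(pre("i"))  # input gate
    f_t = sigmoid(pre("f"))  # forget gate
    o_t = sigmoid(pre("o"))  # output gate
    c_t = f_t * c_prev + i_t * math.tanh(pre("c"))  # new cell state
    h_t = o_t * math.tanh(c_t)  # new hidden state
    return h_t, c_t


# With all-zero parameters every gate equals 0.5 and the candidate is tanh(0) = 0.
zero_params = {gate: (0.0, 0.0, 0.0) for gate in "ifoc"}
h1, c1 = lstm_step(x_t=1.0, h_prev=0.0, c_prev=2.0, params=zero_params)
```

Here the forget gate halves the previous cell state (2.0 becomes 1.0) and nothing new is written, which isolates the role of each term in the cell-state update.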

By employing two LSTMs in opposite directions, BiLSTM reads the input traffic flow sequence forward and backward, and its output has hidden states in both directions. The forward state is $\vec{h} = \{\vec{h_{1}}, \vec{h_{2}}, \dots, \vec{h_{n}}\}$ and the backward state is $\overset{\leftarrow}{h} = \{\overset{\leftarrow}{h_{1}}, \overset{\leftarrow}{h_{2}}, \dots, \overset{\leftarrow}{h_{n}}\}$, where $\vec{h_{i}}$ and $\overset{\leftarrow}{h_{i}}$ represent the hidden vectors output by the forward and backward LSTMs, respectively. The hidden vector at moment i can be expressed as:

$h_{i} = [\vec{h_{i}}; \overset{\leftarrow}{h_{i}}] .$

(9)

The decoder with the global attention mechanism generates the traffic flow output sequence of the EnGS-DGR model, $y = \{y_{1}, y_{2}, \dots, y_{T}\}$, where T is the length of the forecasting sequence. Because traffic flow forecasting requires continuous-valued outputs, a fully connected layer with a linear activation function is used to generate them. The traffic flow output $y_{t}$ at time t in the decoder is calculated as:

$\begin{matrix} y_{t} = \mathrm{Linear} (W [s_{t}; c_{t}] + b), \\ s_{t} = \mathrm{LSTM} (y_{t - 1}, s_{t - 1}, c_{t}), \end{matrix}$

(10)

where $s_{t}$ represents the hidden state of the decoder at time t; $c_{t}$ represents the attention context vector; W and b are trained parameters. The attention context vector $c_{t}$ at time t is expressed as the weighted sum of the encoder hidden states $h_{i}$, which is given by:

$c_{t} = \sum_{i = 1}^{n} α_{t i} h_{i},$

(11)

where the weight $α_{t i}$ of hidden state $h_{i}$ is described as:

$α_{t i} = \mathrm{softmax} (e_{t i}),$

(12)

in which $e_{t i} = a (s_{t - 1}, h_{i})$ represents the correlation between the hidden state around $h_{i}$ and the output at time t; the softmax function ensures that the attention weights are normalized to sum to one.
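The attention computation in (11) and (12) can be sketched as below. The scoring function $a(\cdot)$ is not specified in the text, so a plain dot product is used here purely as an assumption; the function names and the toy encoder states are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_context(s_prev, encoder_states):
    """Score each encoder state against s_prev, normalize, and take the weighted sum."""
    # Dot-product score e_ti = s_prev . h_i (assumed form of the scoring function).
    scores = [sum(a * b for a, b in zip(s_prev, h)) for h in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [
        sum(w * h[k] for w, h in zip(weights, encoder_states)) for k in range(dim)
    ]
    return weights, context


# Hypothetical encoder states; a zero decoder state gives uniform attention.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = attention_context([0.0, 0.0], encoder_states)
```

With a zero decoder state all scores tie, so the context vector reduces to the plain average of the encoder states; a trained scoring function would instead concentrate the weights on the most relevant time steps.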

2.2.4. Loss Function

In order to avoid overfitting, the mean square error (MSE) with an $L_{2}$ regularization term is employed as the loss function, which is described as:

$loss = \frac{1}{τ} \sum_{i = 1}^{τ} {(Y_{i} - {\hat{Y}}_{i})}^{2} + λ L_{reg},$

(13)

where $τ$ is the number of forecasted time slices; $Y_{i}$ is the true value; ${\hat{Y}}_{i}$ denotes the forecasted value; $L_{reg}$ represents the regularization term; and $λ$ is a hyperparameter.
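A small numeric sketch of the loss may help. The form of the regularization term is not spelled out in the text, so the sum of squared weights is assumed here; the function name `mse_l2_loss` and all values are hypothetical.

```python
def mse_l2_loss(y_true, y_pred, weights, lam):
    """Mean squared error plus lam times an L2 penalty on the weights."""
    tau = len(y_true)
    mse = sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / tau
    # Assumed regularization term: sum of squared model weights.
    l_reg = sum(w * w for w in weights)
    return mse + lam * l_reg


# Hypothetical speeds, forecasts, and a two-parameter model.
loss = mse_l2_loss([60.0, 62.0], [61.0, 60.0], weights=[0.5, -0.5], lam=0.1)
```

In this example the data term is (1 + 4) / 2 = 2.5 and the penalty is 0.1 × 0.5 = 0.05, so larger values of the hyperparameter shift the balance toward smaller weights.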

3. Experiments

3.1. Data Description

The efficacy of the proposed EnGS-DGR model is validated through two real-world traffic datasets:

(1): PEMS-BAY dataset, collected by the California Department of Transportation Performance Measurement System (PEMS). This dataset contains data from the Bay Area from 1 January 2017 to 31 May 2017, covering 325 monitor nodes.
(2): METR-LA dataset, collected by loop detectors on the highways of Los Angeles County. This dataset contains data from 1 March 2012 to 30 June 2012, covering 207 monitor nodes.

The monitor distributions in the PEMS-BAY and METR-LA datasets are displayed in Figure 6. Meanwhile, 80% of the traffic data is used as the training set and 20% as the test set.
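One common way to realize such a split for time series is to cut chronologically, so that the test set lies strictly after the training set. Whether the split here is chronological is not stated, so the sketch below is an assumption, and the function name `chronological_split` is hypothetical.

```python
def chronological_split(samples, train_ratio=0.8):
    """Split time-ordered samples into an earlier training part and a later test part."""
    cut = int(len(samples) * train_ratio)
    return samples[:cut], samples[cut:]


# Ten hypothetical time-ordered samples.
train, test = chronological_split(list(range(10)))
```

Keeping the order intact avoids leaking future observations into training, which a random shuffle would allow.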

3.2. Evaluation Performance Metrics

Three evaluation performance metrics are employed to evaluate the effectiveness of proposed EnGS-DGR model, including the root mean square error (RMSE), the mean absolute error (MAE), and the mean absolute percentage error (MAPE).

(1): RMSE values reflect the deviation between the forecasted value and the true value, which is described as:

$RMSE = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} {(Y_{i} - {\hat{Y}}_{i})}^{2}} .$

(14)
(2): MAE values reflect the actual magnitude of the forecasting error, which is given by:

$MAE = \frac{1}{n} \sum_{i = 1}^{n} |Y_{i} - {\hat{Y}}_{i}| .$

(15)
(3): MAPE values measure the relative accuracy of forecasting, which is described as:

$MAPE = \frac{100 \%}{n} \sum_{i = 1}^{n} \left| \frac{Y_{i} - {\hat{Y}}_{i}}{Y_{i}} \right| .$

(16)
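The three metrics translate directly into code. The sketch below uses hypothetical speed values; the function names are illustrative, and `mape` assumes that no true value is zero.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error."""
    n = len(y_true)
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / n)


def mae(y_true, y_pred):
    """Mean absolute error."""
    n = len(y_true)
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / n


def mape(y_true, y_pred):
    """Mean absolute percentage error in percent; true values must be nonzero."""
    n = len(y_true)
    return 100.0 / n * sum(abs((y - p) / y) for y, p in zip(y_true, y_pred))


# Hypothetical ground-truth and forecast speeds.
y_true = [50.0, 60.0, 40.0]
y_pred = [45.0, 63.0, 44.0]
scores = (rmse(y_true, y_pred), mae(y_true, y_pred), mape(y_true, y_pred))
```

For these values the absolute errors are 5, 3, and 4, giving an MAE of 4.0; RMSE is slightly larger because squaring weights the largest error more heavily.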

3.3. Experimental Results and Discussion

To verify the performance of the proposed EnGS-DGR model, eight classical baselines are employed, including the history average model (HA), the autoregressive integrated moving average model (ARIMA), the vector auto-regression model (VAR), the support vector regression model (SVR), the feedforward neural network (FNN), the recurrent neural network with fully connected LSTM hidden units (FC-LSTM), the spatial-temporal graph convolution network (STGCN), and the diffusion convolutional recurrent neural network (DCRNN).

Considering the proposed EnGS-DGR model and the eight baselines with three forecasting intervals of 15 min, 30 min, and 60 min, the comparison curves of the 24-h forecasting results and forecasting errors are shown in Figure 7 and Figure 8, in which the horizontal axis is divided into 5 min time slices and the vertical axis represents the travel speeds of the forecasting results and the ground truth. Most of the forecasting errors between the results of the EnGS-DGR model and the ground truth are confined to the range $[-2, 2]$. It can be seen that the proposed EnGS-DGR model not only closely approximates the ground truth but also tracks the velocity trend during different time frames.

Considering the three evaluation performance metrics, the specific forecasting results of the proposed EnGS-DGR model and the eight baselines are presented in Table 1. Taking 30 min forecasting on the PEMS-BAY dataset as an example, the detailed comparisons can be analyzed as follows:

(1): In comparison with the RMSE values obtained from HA, ARIMA, and SVR, the decrement rates of the proposed EnGS-DGR model are $61.5\%$, $19.7\%$, and $26.3\%$, respectively. Meanwhile, for MAE, the decrement rates of the proposed EnGS-DGR model are $41.3\%$, $27.5\%$, and $31.9\%$ compared with HA, ARIMA, and SVR, respectively. These comparison results show that the proposed EnGS-DGR model is better suited to complex nonlinear traffic data series than these three baselines.
(2): Compared with the two GCN-based baselines, STGCN and DCRNN, the RMSE values under the EnGS-DGR model are reduced by about $10.5\%$ and $3.8\%$, respectively. For MAE, the decrement rates of the proposed EnGS-DGR model are $6.6\%$ and $2.9\%$ compared with STGCN and DCRNN, respectively. It can be seen that the proposed EnGS-DGR model achieves better forecasting accuracy and can capture the spatial-temporal correlation characteristics of traffic flow.
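The decrement rates quoted above are read here as relative error reductions with respect to each baseline; this reading is an assumption, since the formula is not given in the text. The sketch uses made-up error values, not the entries of Table 1, and the function name `decrement_rate` is hypothetical.

```python
def decrement_rate(baseline_error, model_error):
    """Relative error reduction of the model with respect to a baseline, in percent."""
    return 100.0 * (baseline_error - model_error) / baseline_error


# Hypothetical error values for illustration only (not taken from Table 1).
rate = decrement_rate(baseline_error=4.0, model_error=3.0)
print(f"{rate:.1f}%")  # prints "25.0%"
```

A positive rate means the model's error is lower than the baseline's, and a rate of zero means the two errors are equal.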

After that, the effectiveness of dynamic graph reconfiguration algorithm will be discussed.

First, the original graphs in the PEMS-BAY and METR-LA datasets consist of 325 and 207 traffic nodes, respectively. After applying the designed DGR algorithm, the numbers of traffic nodes in the reconfigured graphs under different forecasting intervals are shown in Table 2. It can be seen that the number of traffic nodes is significantly decreased under every forecasting interval. Meanwhile, from Figure 6 and Figure 7 and Table 2, it can be seen that the proposed EnGS-DGR performs better than the other forecasting models with both short-term and long-term forecasting intervals, even though the number of traffic nodes is reduced remarkably in both cases.

Next, the designed DGR algorithm is combined with the FC-LSTM, STGCN, and DCRNN algorithms, and the corresponding models are named DGR-FS-LSTM, DGR-STGCN, and DGR-DCRNN, respectively. With the forecasting interval predefined as 30 min, the performance indices of traffic forecasting under DGR-FS-LSTM, DGR-STGCN, and DGR-DCRNN are displayed in Table 3. The forecasting performance of the models combined with the DGR algorithm is better than that of the models without it. That is, without reducing the forecasting accuracy, the dynamic graph reconfiguration algorithm can effectively reduce the amount of input data under dynamic traffic flow.

From the above simulation results and analyses, with relatively few traffic nodes, the proposed EnGS-DGR model outperforms the state-of-the-art traffic forecasting models with both the short-term and the long-term forecasting intervals.

4. Conclusions and Future Work

This paper presents a new deep learning model (EnGS-DGR) to improve the accuracy of traffic flow forecasting on road networks. By employing multi-view clustering theory to quantize the correlation degree among traffic nodes, a dynamic graph reconfiguration algorithm was designed to remove irrelevant spatio-temporal traffic nodes. After that, the GCN and Seq2Seq were merged into a spatio-temporal learning module. The advantages of the proposed EnGS-DGR model were validated by comparison with eight traffic forecasting baseline methods on two real traffic datasets at 15, 30, and 60 min forecasting intervals.

However, this paper still has its limitations: (a) The forecasting model only considers traffic data and does not take into account external factors affecting the traffic flow. (b) As the traffic data are not sufficient, manual data interpolation is still required. (c) For a single traffic flow forecasting model, it is difficult to balance the forecasting errors. In the future, the occasion of dynamic graph reconfiguration could be defined as an event-driven approach involving traffic factors such as traffic congestion and accidents. Then, a data interpolation module will be supplied to reduce the data interpolation work in data preprocessing. Meanwhile, the deep learning models that utilize the ensembling of different GCN-based traffic forecasting methods will be discussed to capture the spatio-temporal features for either short-term traffic forecasting or long-term traffic forecasting.

Author Contributions

Conceptualization, S.-Y.H. and J.Z.; methodology, S.-Y.H. and Q.Z.; validation, Q.-W.S. and Q.Z.; writing—original draft preparation, Q.Z.; writing—review and editing, S.-Y.H. and J.Z.; supervision, Y.-H.C.; project administration, J.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China under Grants 61903156 and 61873324, the Natural Science Foundation of Shandong Province for Key Project under Grant ZR2020KF006, the Natural Science Foundation of Shandong Province under Grant ZR2019MF040, the University Innovation Team Project of Jinan under Grant 2019GXRC015, the Higher Educational Science and Technology Program of Jinan City under Grant 2020GXRC057, and the State Scholarship Fund of the China Scholarship Council.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data for this study are openly available at https://github.com/liyaguang/DCRNN, accessed on 1 February 2022.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Boukerche, A.; Tao, Y.; Sun, P. Artificial intelligence-based vehicular traffic flow prediction methods for supporting intelligent transportation systems. Comput. Netw. 2020, 182, 107484.
2. Sun, P.; Boukerche, A. Assisted data dissemination methods for supporting intelligent transportation systems. Internet Technol. Lett. 2021, 4, e169.
3. Boukerche, A.; Wang, J. Machine Learning-based traffic prediction models for Intelligent Transportation Systems. Comput. Netw. 2020, 181, 107530.
4. Yuan, J.; Zheng, Y.; Xie, X.; Sun, G. Driving with knowledge from the physical world. In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Diego, CA, USA, 21–24 August 2011; pp. 316–324.
5. Li, K.; Zhai, C.; Xu, J. Short-term traffic flow prediction using a methodology based on ARIMA and RBF-ANN. In Proceedings of the 2017 Chinese Automation Congress (CAC), Jinan, China, 20–22 October 2017; pp. 2804–2807.
6. Hamed, M.; Al-Masaeid, H.; Said, Z. Short-term prediction of traffic volume in urban arterials. J. Transp. Eng. 1995, 121, 249–254.
7. Kumar, S. Traffic flow prediction using Kalman filtering technique. Procedia Eng. 2017, 187, 582–587.
8. Okutani, I.; Stephanedes, Y.J. Dynamic prediction of traffic volume through Kalman filtering theory. Transp. Res. Part B Methodol. 1987, 18, 1–11.
9. Kim, C.; Hobeika, A.G. A short-term demand forecasting model from real-time traffic data. Infrastruct. Plan. Manag. 1993, 540–550.
10. Dissanayake, B.; Hemachandra, O.; Lakshitha, N.; Haputhanthri, D.; Wijayasiri, A. A comparison of ARIMAX, VAR and LSTM on multivariate short-term traffic volume forecasting. In Proceedings of the Conference of Open Innovations Association, FRUCT. FRUCT Oy, Oulu, Finland, 27–29 October 2021; pp. 564–570.
11. Yang, Z.; Wang, Y.; Guan, Q. Short-term traffic flow prediction method based on SVM. J. Jilin Univ. 2006, 6, 9.
12. Mingheng, Z.; Yaobao, Z.; Ganglong, H.; Gang, C. Accurate multisteps traffic flow prediction based on SVM. Math. Probl. Eng. 2013.
13. Zhang, L.; Liu, Q.; Yang, W.; Wei, N.; Dong, D. An improved k-nearest neighbor model for short-term traffic flow prediction. Procedia-Soc. Behav. Sci. 2013, 96, 653–662.
14. Bernaś, M.; Płaczek, B.; Porwik, P.; Pamuła, T. Segmentation of vehicle detector data for improved k-nearest neighbours-based traffic flow prediction. IET Intell. Transp. Syst. 2014, 9, 264–274.
15. Van Lint, J.W.C.; Van Hinsbergen, C. Short-term traffic and travel time prediction models. Artif. Intell. Appl. Crit. Transp. Issues 2012, 22, 22–41.
16. Hong, W.C.; Dong, Y.; Zheng, F.; Lai, C.Y. Forecasting urban traffic flow by SVR with continuous ACO. Appl. Math. Model. 2011, 35, 1282–1291.
17. Suthaharan, S. Machine learning models and algorithms for big data classification. Integr. Ser. Inf. Syst. 2016, 36, 1–12.
18. Messai, N.; Thomas, P.; Lefebvre, D.; El Moudni, A. A neural network approach for freeway traffic flow prediction. Proc. Int. Conf. Control. Appl. 2002, 2, 984–989.
19. Chen, X.; Lu, J.; Zhao, J.; Qu, Z.; Yang, Y.; Xian, J. Traffic flow prediction at varied time scales via ensemble empirical mode decomposition and artificial neural network. Sustainability 2020, 12, 3678.
20. Fu, R.; Zhang, Z.; Li, L. Using LSTM and GRU neural network methods for traffic flow prediction. In Proceedings of the 2016 31st Youth Academic Annual Conference of Chinese Association of Automation (YAC), Wuhan, China, 11–13 November 2016; pp. 324–328.
21. Yang, D.; Li, S.; Peng, Z.; Wang, P.; Wang, J.; Yang, H. MF-CNN: Traffic flow prediction using convolutional neural network and multi-features fusion. IEICE Trans. Inf. Syst. 2019, 102, 1526–1536.
22. Yao, H.; Tang, X.; Wei, H.; Zheng, G.; Li, Z. Revisiting spatial-temporal similarity: A deep learning framework for traffic prediction. Proc. AAAI Conf. Artif. Intell. 2019, 33, 5668–5675.
23. Li, Y.; Yu, R.; Shahabi, C.; Liu, Y. Diffusion convolutional recurrent neural network: Data-driven traffic forecasting. arXiv 2017, arXiv:1707.01926.
24. Zhao, L.; Song, Y.; Zhang, C.; Liu, Y.; Wang, P.; Lin, T.; Deng, M.; Li, H. T-gcn: A temporal graph convolutional network for traffic prediction. IEEE Trans. Intell. Transp. Syst. 2019, 21, 3848–3858.
25. Yu, B.; Yin, H.; Zhu, Z. Spatio-temporal graph convolutional networks: A deep learning framework for traffic forecasting. arXiv 2017, arXiv:1709.04875.
26. Shang, P.; Liu, X.; Yu, C.; Yan, G.; Xiang, Q.; Mi, X. A new ensemble deep graph reinforcement learning network for spatio-temporal traffic volume forecasting in a freeway network. Digit. Signal Process. 2022, 103419.
27. Jin, G.; Sha, H.; Feng, Y.; Cheng, Q.; Huang, J. GSEN: An ensemble deep learning benchmark model for urban hotspots spatiotemporal prediction. Neurocomputing 2021, 455, 353–367.
28. Hou, F.; Zhang, Y.; Fu, X.; Jiao, L.; Zheng, W. The Prediction of Multistep Traffic Flow Based on AST-GCN-LSTM. J. Adv. Transp. 2021.
29. Chitalia, G.; Pipattanasomporn, M.; Garg, V.; Rahman, S. Robust short-term electrical load forecasting framework for commercial buildings using deep recurrent neural networks. Appl. Energy 2020, 278, 115410.
30. Dikshit, A.; Pradhan, B.; Santosh, M. Artificial neural networks in drought prediction in the 21st century–A scientometric analysis. Appl. Soft Comput. 2022, 114, 108080.
31. Zhang, C.; Fu, H.; Liu, S.; Liu, G.; Cao, X. Low-rank tensor constrained multiview subspace clustering. In Proceedings of the IEEE International Conference on Computer Vision, Washington, DC, USA, 7–13 December 2015; pp. 1582–1590.
32. Ni, T.; Wang, L.; Zhang, P.; Wang, B.; Li, W. Daily tourist flow forecasting using SPCA and CNNSTM neural network. Concurr. Comput. Pract. Exp. 2021, 33, e5980.
33. Li, M.; Miao, Z.; Xu, W. A CRNN-based Attention-seq2seq Model with Fusion Feature for Automatic Labanotation Generation. Neurocomputing 2021, 454, 430–440.

Figure 1. Spatial-temporal correlation of traffic flow.

Figure 2. The definition representation of traffic flow forecasting problem.

Figure 3. The process of traffic flow forecasting.

Figure 4. The framework of EnGS-DGR model.

Figure 5. The flow of low-rank tensor-constrained multiview subspace representation algorithm.

Figure 6. Monitor distribution in PEMS-BAY and METR-LA dataset [23].

Figure 7. Visualization of forecast results for METR-LA dataset. (a) The comparison curves under 15 min forecasting interval. (b) The comparison curves under 30 min forecasting interval. (c) The comparison curves under 60 min forecasting interval. (d) The curves of forecasting errors.

Figure 8. Forecasting results for PEMS-BAY dataset. (a) The comparison curves under 15 min forecasting interval. (b) The comparison curves under 30 min forecasting interval. (c) The comparison curves under 60 min forecasting interval. (d) The curves of forecasting errors.

Table 1. The comparisons of evaluation indicators under different forecasting models.

Dataset	Intervals	Metrics	HA	ARIMA	VAR	SVR	FNN	FC-LSTM	STGCN	DCRNN	EnGS-DGR
PEMS-BAY	15 min	MAE	2.88	1.62	1.74	1.85	2.20	2.05	1.36	1.38	1.39
		RMSE	5.59	3.30	3.16	3.59	4.42	4.19	2.96	2.95	2.92
		MAPE	6.82%	3.54%	3.66%	3.88%	5.19%	4.81%	2.90%	2.91%	2.88%
	30 min	MAE	2.88	2.33	2.32	2.48	2.30	2.20	1.81	1.74	1.69
		RMSE	5.59	4.76	4.25	5.18	4.63	4.55	4.27	3.97	3.82
		MAPE	6.82%	5.41%	5.05%	5.52%	5.43%	5.23%	4.17%	3.95%	3.80%
	60 min	MAE	2.88	3.38	2.93	3.28	2.46	2.37	2.49	2.07	1.95
		RMSE	5.59	6.50	5.44	7.08	4.98	4.96	5.16	4.74	4.61
		MAPE	6.82%	8.33%	6.50%	8.07%	5.89%	5.71%	5.96%	4.92%	4.70%
METR-LA	15 min	MAE	4.16	3.99	4.42	3.99	3.99	3.44	3.60	2.77	2.70
		RMSE	7.80	8.21	7.89	8.45	7.94	6.30	6.56	5.38	5.22
		MAPE	13.01%	9.61%	10.22%	9.30%	9.93%	9.66%	9.83%	7.33%	7.01%
	30 min	MAE	4.16	5.15	5.41	5.05	4.23	3.77	4.01	3.15	3.01
		RMSE	7.80	10.45	9.13	10.87	8.17	7.23	7.66	6.45	6.25
		MAPE	13.01%	12.72%	12.71%	12.15%	12.90%	10.92%	11.25%	8.84%	8.16%
	60 min	MAE	4.16	6.90	6.90	6.72	4.49	4.37	4.51	3.60	3.49
		RMSE	7.80	13.23	13.23	13.76	8.69	8.69	8.87	7.59	7.36
		MAPE	13.01%	17.44%	17.46%	16.77%	14.03%	13.29%	13.56%	10.56%	10.24%
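
For readers who want to reproduce the kind of numbers reported in Table 1, the three metrics can be sketched as follows. This is a minimal illustration under the usual textbook definitions of MAE, RMSE, and MAPE; the masking of zero ground-truth values in `mape` is an assumed convention, not one stated here, and the sample arrays are made up.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(y_true - y_pred)))

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent.

    Assumption: entries whose ground truth is zero are skipped to
    avoid division by zero.
    """
    mask = y_true != 0
    ratio = np.abs((y_true[mask] - y_pred[mask]) / y_true[mask])
    return float(np.mean(ratio) * 100.0)

# Illustrative values only (not taken from either dataset).
y_true = np.array([60.0, 50.0, 40.0, 0.0])
y_pred = np.array([57.0, 52.0, 44.0, 1.0])

print(f"MAE={mae(y_true, y_pred):.2f}  "
      f"RMSE={rmse(y_true, y_pred):.2f}  "
      f"MAPE={mape(y_true, y_pred):.2f}%")
```

Because RMSE squares the errors before averaging, it penalizes large deviations more heavily than MAE, which is why the two can rank models differently.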

Table 2. The comparisons of number of nodes under DGR algorithm.

Dataset	Intervals	Number of Traffic Nodes	Decrement Rates
PEMS-BAY	15 min	46	85.85%
	30 min	102	68.62%
	60 min	225	30.77%
METR-LA	15 min	38	81.64%
	30 min	84	59.42%
	60 min	169	18.36%
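
The decrement rates in Table 2 can be checked with a short calculation. The sketch below assumes the commonly reported sensor counts of 325 for PEMS-BAY and 207 for METR-LA (these totals are not listed in the table itself) and defines the rate as the share of nodes removed; the names `TOTAL_NODES`, `RETAINED`, and `decrement_rate` are illustrative.

```python
# Assumption: full-network sensor counts as commonly reported for these datasets.
TOTAL_NODES = {"PEMS-BAY": 325, "METR-LA": 207}

# Number of traffic nodes retained per interval, copied from Table 2.
RETAINED = {
    "PEMS-BAY": {"15 min": 46, "30 min": 102, "60 min": 225},
    "METR-LA": {"15 min": 38, "30 min": 84, "60 min": 169},
}

def decrement_rate(total, retained):
    """Percentage of nodes removed from the full graph."""
    return (total - retained) / total * 100.0

for dataset, per_interval in RETAINED.items():
    for interval, retained in per_interval.items():
        rate = decrement_rate(TOTAL_NODES[dataset], retained)
        print(f"{dataset} {interval}: {rate:.2f}%")
```

Under these assumed totals, the computed rates agree with the table to two decimal places, and they show that longer forecasting intervals retain more nodes.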

Table 3. Performance indices under different models combined with DGR algorithm.

Algorithm	MAE	RMSE	MAPE
DGR-FC-LSTM	2.15	4.49	5.21%
FC-LSTM	2.20	4.55	5.23%
DGR-STGCN	1.79	4.22	4.13%
STGCN	1.81	4.27	4.17%
DGR-DCRNN	1.70	3.91	3.90%
DCRNN	1.74	3.97	3.95%
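
To put the differences in Table 3 in perspective, the relative error reduction obtained by adding DGR to each baseline can be computed directly from the tabulated values. The following is a small sketch; the dictionary layout and the helper `relative_reduction` are illustrative choices, and the numbers are copied from Table 3.

```python
# (MAE, RMSE, MAPE in %) copied from Table 3.
TABLE3 = {
    "FC-LSTM": {"base": (2.20, 4.55, 5.23), "dgr": (2.15, 4.49, 5.21)},
    "STGCN": {"base": (1.81, 4.27, 4.17), "dgr": (1.79, 4.22, 4.13)},
    "DCRNN": {"base": (1.74, 3.97, 3.95), "dgr": (1.70, 3.91, 3.90)},
}
METRICS = ("MAE", "RMSE", "MAPE")

def relative_reduction(base, dgr):
    """Percentage by which the DGR variant lowers the baseline error."""
    return (base - dgr) / base * 100.0

reductions = {
    model: {
        metric: relative_reduction(b, d)
        for metric, b, d in zip(METRICS, rows["base"], rows["dgr"])
    }
    for model, rows in TABLE3.items()
}

for model, per_metric in reductions.items():
    summary = ", ".join(f"{m} -{v:.2f}%" for m, v in per_metric.items())
    print(f"{model}: {summary}")
```

Every reduction is positive but small (roughly 0.4% to 2.3% of the baseline error), which is consistent with reading the result as "no loss of accuracy with fewer input nodes" rather than as a large accuracy gain.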

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

