1. Introduction
Rapid economic development has driven a sharp increase in traffic demand and, in turn, led to various traffic problems (e.g., traffic congestion, air pollution, and traffic accidents). Intelligent transportation systems (ITS) that support traffic control and management are considered an efficient means of alleviating such problems. Traffic flow prediction provides critical traffic state information for ITS, which helps traffic participants make better travel decisions and enhances traffic operation efficiency [1,2,3,4]. More specifically, with the help of traffic flow data, we can develop better traffic control strategies (e.g., adaptive traffic signal control and dynamic speed-limit setting) and fine-tune traffic parameters so that they account for fluctuations in roadway traffic conditions. The success of a traffic control strategy therefore depends heavily on the resolution of the traffic flow prediction data [5]. Lane-level traffic flow data is more sensitive to microscopic traffic state estimation accuracy (such as traffic speed, volume, and occupancy) and has thus become a hot topic in the traffic community [6,7,8].
Traffic flow prediction can be roughly divided into short-term and long-term prediction in terms of the time span. More specifically, long-term traffic flow prediction aims to provide traffic flow data several hours in advance, while short-term traffic flow prediction works at the minute level (which is our research focus) [9,10]. Previous studies suggest that linear models, nonlinear models, and hybrid models are the three typical types of traffic flow prediction techniques [11]. Linear models employ mathematical methods to conduct traffic flow prediction tasks. For instance, the Autoregressive Integrated Moving Average (ARIMA) model showed a satisfactory performance on long-term traffic flow prediction tasks [12,13]. Nonlinear models introduce machine learning methods to tackle the traffic flow prediction challenge. For instance, neural network models have shown success in many traffic flow prediction applications. Yasdi et al. employed an artificial neural network (ANN) to forecast traffic flow on a single roadway segment [14]. Yao et al. established a granular traffic flow forecasting model with an ANN model, and the experimental results indicated that the proposed model outperformed other popular models (e.g., the Robertson model) [15].
It is noted that nonlinear models have enjoyed huge success in tackling many traffic flow prediction tasks. However, the uncertainty of traffic flow data may have a negative effect on nonlinear prediction models, which can degrade a model’s performance. Hybrid models (i.e., models combining both linear and nonlinear models) have been proposed to overcome these disadvantages. Hou et al. proposed a novel long-term traffic flow forecasting method, which was verified with real-time traffic data [16]. Jonathan et al. verified the hierarchical temporal memory (HTM) model’s performance when conducting a short-term traffic flow prediction task over real-world traffic data from Sydney. Although the long short-term memory (LSTM) model showed better performance than the HTM model, Jonathan et al. believed that the HTM model is a potentially efficient method for short-term traffic flow prediction tasks [17]. Zhao et al. proposed the temporal graph convolutional network (T-GCN) model for traffic forecasting based on the urban road network and found that the predictions outperformed state-of-the-art baselines on real-world traffic datasets [18]. Similar research can be found in [4,19,20,21,22,23].
Raw traffic data collected from inductive loop detectors may contain unexpected outliers, and thus much research effort has been devoted to enhancing the data quality. More specifically, many data denoising models, such as the wavelet Kalman filter model [25,26,27] and the wavelet transform [28,29,30], have been introduced to suppress noise before the traffic flow prediction task, and it has been found that a model combined with a denoising algorithm predicts better than the same model without a denoising process [24]. Empirical mode decomposition (EMD) was first proposed to remove nonlinear noise from the initial data series [31]. EMD extracts a set of intrinsic mode functions (IMFs) from the input data samples, which can be separated into high-frequency (HF) and low-frequency (LF) portions. The HF IMFs are considered to be the data details and noise embedded in the original traffic data, and the LF counterparts are the data’s contours. The EMD model is susceptible to mode mixing, which may severely degrade its performance when the IMFs contain intermittent features. To address this issue, the ensemble empirical mode decomposition (EEMD) model was proposed by Wu et al. [32]. The EEMD model discards the noisy IMFs and selects noise-free IMFs to reconstruct smoothed data.
This study proposes a simple but efficient traffic flow prediction framework based on EEMD and an artificial neural network (ANN). More specifically, we introduced the EEMD model to cleanse the initial traffic flow data collected from neighboring loop detectors, which were installed on the same roadway lane. The ANN model was then employed to forecast short-term traffic flow data at different time scales. We verified the performance of the proposed framework in both the data cleansing and prediction procedures. The findings of the research can provide accurate traffic flow data in advance, which benefits traffic authorities by enabling them to take more reasonable traffic management and control measures to reduce traffic congestion and enhance traffic safety. The remainder of the paper is organized as follows:
Section 2 illustrates the proposed traffic flow data denoising model and prediction model in detail.
Section 3 describes the data source and specific experimental results.
Section 4 briefly concludes the research.
2. Methodology
2.1. Schematic Overview
Traffic data generated by loop detectors is crucial for traffic flow prediction accuracy and is expected to be smooth and noise free. However, unwanted factors (e.g., detector damage and roadway maintenance) may degrade the original data quality, which in turn affects the traffic flow prediction accuracy. As a result, traffic flow data samples provided by inductive loop detectors are composed of smooth samples (i.e., ground truth traffic flow data) and noisy samples (i.e., anomalous data). We first introduced the EEMD method to correct the raw traffic data, and then the ANN model was employed to predict traffic flow. The flowchart for the proposed framework is shown in Figure 1.
2.2. EEMD Model for Denoising Raw Traffic Flow Data
The EEMD model has shown great success in traffic flow data denoising tasks because it adaptively decomposes both stationary and nonstationary data. Owing to this advantage, the EEMD model has been reported to outperform other data denoising models (e.g., wavelet filters, the short-time Fourier transform, and the moving average) [33,34]. The EEMD method extracts the IMFs from the input raw traffic data with the sifting procedure, which is described in detail as follows:
- (a) initialize all the parameters and add white noise to the raw traffic data, $x(t)$;
- (b) identify the local maximum and minimum values of the input traffic flow data series;
- (c) connect the local maxima to obtain the upper envelope, $u(t)$, and connect the local minima in a similar manner to obtain the lower envelope, $l(t)$;
- (d) calculate the average envelope, $m(t)$, from the upper and lower envelopes using Equation (1):
$$m(t) = \frac{u(t) + l(t)}{2} \tag{1}$$
- (e) obtain the data difference, $h(t)$, between the raw traffic flow data, $x(t)$, and the average envelope, $m(t)$, using Equation (2):
$$h(t) = x(t) - m(t) \tag{2}$$
- (f) obtain the IMF element when the following two conditions are satisfied: the number of extrema and the number of zero crossings differ by at most 1; the mean of the upper and lower envelopes is zero at every point. More specifically, $h(t)$ is considered an IMF when both conditions are met. If not, the upper and lower envelopes of $h(t)$ are used in the $k$th round of the search for the IMF, starting from $h_0(t) = h(t)$, with the following rule:
$$h_k(t) = h_{k-1}(t) - m_k(t) \tag{3}$$
where $m_k(t)$ is the mean envelope for the traffic flow data in the $k$th (k = 1, 2, …, s) round, and $h_k(t)$ is the candidate IMF in the $k$th (k = 1, 2, …, s) round of the iteration procedure.
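A single sifting pass (steps (b) to (e)) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: the function names are hypothetical, and the envelopes are piecewise-linear with their end points pinned to the data, whereas EMD implementations commonly use cubic-spline envelopes.

```python
def local_extrema(x):
    """Return the indices of interior local maxima and minima of the sequence x."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i - 1] < x[i] >= x[i + 1]:
            maxima.append(i)
        elif x[i - 1] > x[i] <= x[i + 1]:
            minima.append(i)
    return maxima, minima


def envelope(x, knots):
    """Piecewise-linear envelope through x at the knot indices.

    Assumption: the first and last samples are used as extra knots so that
    the envelope covers the whole series.
    """
    pts = [0] + knots + [len(x) - 1]
    env = [0.0] * len(x)
    for a, b in zip(pts, pts[1:]):
        for i in range(a, b + 1):
            w = (i - a) / (b - a)
            env[i] = (1 - w) * x[a] + w * x[b]
    return env


def sift_once(x):
    """One sifting pass: subtract the mean of the upper and lower envelopes."""
    maxima, minima = local_extrema(x)
    upper = envelope(x, maxima)
    lower = envelope(x, minima)
    mean_env = [(u + l) / 2 for u, l in zip(upper, lower)]
    return [xi - mi for xi, mi in zip(x, mean_env)]


# A slow linear trend plus a fast oscillation: one pass removes the trend
# away from the end points and leaves the oscillation.
series = [0.1 * i + (-1) ** i for i in range(21)]
detail = sift_once(series)
```

Repeating `sift_once` on its own output until the two IMF conditions hold gives the candidate IMF of step (f).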
The procedure stops when the updated $h_k(t)$ meets the two conditions mentioned above. In other words, the obtained $h_k(t)$ is considered the IMF (denoted as $c_1(t)$). Then, the difference between the initial traffic data, $x(t)$, and $c_1(t)$ is calculated (see Equation (4)), and this difference is further processed to extract the remaining IMFs. Note that the decomposition plays the role of a dyadic filter bank, indicating that the residual traffic flow data, $r_1(t)$, contains the longer-term traffic flow variation tendency.
$$r_1(t) = x(t) - c_1(t) \tag{4}$$
The EEMD sifting procedure is applied iteratively until the residual traffic flow data, $r_m(t)$, is a monotonic data series or contains only one extremum. The relationship between the raw traffic data, $x(t)$, the decomposed IMFs, $c_i(t)$ (i = 1, 2, …, m), and the residual data is established as follows:
$$x(t) = \sum_{i=1}^{m} c_i(t) + r_m(t) \tag{5}$$
where $m$ is the number of IMFs in the set and $r_m(t)$ is the residual data.
When the EEMD model finished the data decomposition procedure, we could determine the final data denoising result by averaging the IMFs (see Equation (6)). It is noted that the added white noise was suppressed when calculating the mean value from IMF segments.
where parameter
is the ensemble number and
is the residual traffic data.
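The ensemble step can be illustrated with a short sketch. It is not the authors' code: `ensemble_average` and its parameters are hypothetical names, `decompose` stands in for any EMD routine, and the sketch assumes that every noisy trial yields the same number of components (real EMD runs may not, and implementations handle this in different ways).

```python
import random
import statistics


def ensemble_average(signal, decompose, noise_ratio=0.2, ensemble=100, seed=0):
    """Average the components obtained from noisy copies of the signal.

    decompose   : callable mapping a list of floats to a list of components
                  (hypothetical stand-in for an EMD routine).
    noise_ratio : white-noise standard deviation as a fraction of the
                  signal's standard deviation (assumed convention).
    ensemble    : number of noisy trials to average.
    """
    rng = random.Random(seed)
    sigma = noise_ratio * statistics.pstdev(signal)
    sums = None
    for _ in range(ensemble):
        noisy = [s + rng.gauss(0.0, sigma) for s in signal]
        components = decompose(noisy)
        if sums is None:
            sums = [[0.0] * len(signal) for _ in components]
        # Accumulate component by component; assumes a fixed component count.
        for acc, comp in zip(sums, components):
            for i, v in enumerate(comp):
                acc[i] += v
    return [[v / ensemble for v in acc] for acc in sums]


# With a trivial "decomposition" that returns the series unchanged, the
# averaged result approaches the clean series as the ensemble grows.
clean = [(i % 10) - 4.5 for i in range(50)]
averaged = ensemble_average(clean, lambda s: [s], ensemble=400, seed=1)
```

Because the added noise is zero-mean and independent across trials, its contribution to the average shrinks as the ensemble number grows, which is why larger ensembles cost more computation but leave less residual noise.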
2.3. Traffic Flow Prediction with ANN Model
The artificial neural network model aims to learn the intrinsic nonlinear relationships between the input and output traffic flow data, imitating the way human beings perceive information. The ANN model performs well on incomplete associative memory, pattern recognition, and other similar tasks. Moreover, it has demonstrated efficacy in dealing with NP-hard (nondeterministic polynomial-time hard) problems, for which it is difficult to build an explicit model. Considering the traffic flow data features, we employed the back-propagation (BP) neural network (a type of feed-forward ANN model) to fulfill the traffic data prediction task.
The input layer of the BP network receives the raw traffic flow data sequence, which is further processed in the hidden layers. Note that the input traffic data points are mapped into hidden nodes with different weights. More specifically, the weighted inputs, together with the thresholds (i.e., biases), serve as the input to the hidden nodes. After that, the hidden layers learn the intrinsic traffic flow data patterns in an iterative manner. The hidden node outputs are then obtained by applying a transfer function (the sigmoid function was used in our study). During the network training procedure, the BP network quantifies the difference (i.e., error) between the prediction outputs and the ground truth traffic flow data. The BP network fine-tunes the network (both the structure and the parameter settings) according to the back-propagated errors, and the procedure stops when the default stopping criterion is met. The BP network structure used in our study is shown in Figure 2.
Suppose the numbers of neurons in the BP network input layer, hidden layer, and output layer are g, p, and q, respectively. The parameter $w_{rz}$ (r = 1, 2, …, g; z = 1, 2, …, p) is the weight between neurons connecting the input and hidden layers, and $v_{rz}$ (r = 1, 2, …, p; z = 1, 2, …, q) is the weight linking the hidden and output layers. The thresholds for outputting the network-learned traffic flow data pattern are $a_r$ (r = 1, 2, …, p) and $b_r$ (r = 1, 2, …, q), respectively. In our study, ANN refers to the BP network unless otherwise specified.
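To make the roles of the weights and thresholds concrete, the following sketch implements a network with g inputs, p sigmoid hidden neurons, and q outputs, together with one back-propagation update. It is an illustration under assumptions rather than the configuration used in the study: the class and attribute names (`BPNetwork`, `w`, `v`, `a`, `b`), the linear output layer, the squared-error loss, and the learning rate are all choices made here for brevity.

```python
import math
import random


def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))


class BPNetwork:
    """g inputs -> p sigmoid hidden neurons -> q linear outputs (assumed layout)."""

    def __init__(self, g, p, q, seed=0):
        rng = random.Random(seed)
        # w[r][z]: weight from input r to hidden neuron z.
        self.w = [[rng.uniform(-0.5, 0.5) for _ in range(p)] for _ in range(g)]
        # v[z][k]: weight from hidden neuron z to output k.
        self.v = [[rng.uniform(-0.5, 0.5) for _ in range(q)] for _ in range(p)]
        self.a = [0.0] * p  # hidden-layer thresholds (biases)
        self.b = [0.0] * q  # output-layer thresholds (biases)

    def forward(self, x):
        hidden = [
            sigmoid(sum(x[r] * self.w[r][z] for r in range(len(x))) + self.a[z])
            for z in range(len(self.a))
        ]
        out = [
            sum(hidden[z] * self.v[z][k] for z in range(len(hidden))) + self.b[k]
            for k in range(len(self.b))
        ]
        return hidden, out

    def train_step(self, x, target, lr=0.1):
        """One gradient step on 0.5 * squared error; returns the squared error before the step."""
        hidden, out = self.forward(x)
        err = [o - t for o, t in zip(out, target)]
        # Error propagated back to each hidden neuron (uses pre-update weights).
        delta_h = [
            hidden[z] * (1 - hidden[z]) * sum(err[k] * self.v[z][k] for k in range(len(err)))
            for z in range(len(hidden))
        ]
        for z in range(len(hidden)):
            for k in range(len(err)):
                self.v[z][k] -= lr * err[k] * hidden[z]
            self.a[z] -= lr * delta_h[z]
        for k in range(len(err)):
            self.b[k] -= lr * err[k]
        for r in range(len(x)):
            for z in range(len(hidden)):
                self.w[r][z] -= lr * delta_h[z] * x[r]
        return sum(e * e for e in err)
```

Calling `train_step` repeatedly over the training samples corresponds to the iterative error back-propagation described above; a stopping criterion (for example, a validation-error check) would be added around that loop.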
2.4. Prediction Goodness Measurements
To verify the performance of the proposed framework, we compared the predicted traffic flow data with the ground truth data using statistical measurements. Three goodness-of-fit indicators were introduced to quantify the model performance: the mean absolute error (MAE), the mean absolute percentage error (MAPE), and the root mean square error (RMSE). For any given traffic flow data series, we obtained the three indicators with Equations (7)–(9). Note that a smaller MAE, MAPE, or RMSE indicates higher prediction accuracy, while larger values indicate that the prediction performance may be unsatisfactory.
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|\hat{y}_i - y_i\right| \tag{7}$$
$$\mathrm{MAPE} = \frac{100\%}{n}\sum_{i=1}^{n}\left|\frac{\hat{y}_i - y_i}{y_i}\right| \tag{8}$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2} \tag{9}$$
where $\hat{y}_i$ is the predicted traffic flow data sample, $y_i$ is the ground truth data, and $n$ is the number of samples.
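The three indicators can be computed with a few lines of Python. The function names are illustrative, the definitions are the standard ones, and the MAPE is expressed in percent under the assumption that no ground truth value is zero.

```python
import math


def mae(pred, truth):
    """Mean absolute error."""
    return sum(abs(p - t) for p, t in zip(pred, truth)) / len(truth)


def mape(pred, truth):
    """Mean absolute percentage error in percent; assumes no zero ground truth."""
    return 100.0 * sum(abs((p - t) / t) for p, t in zip(pred, truth)) / len(truth)


def rmse(pred, truth):
    """Root mean square error."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth))


# Hypothetical volumes: two predictions are off by 10 vehicles, one is exact.
predicted = [110.0, 90.0, 100.0]
observed = [100.0, 100.0, 100.0]
print(round(mae(predicted, observed), 3))   # 6.667
print(round(mape(predicted, observed), 3))  # 6.667
print(round(rmse(predicted, observed), 3))  # 8.165
```

The RMSE exceeds the MAE in this example because squaring weights the two larger errors more heavily than the exact prediction.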
3. Experiment
Traffic flow prediction is crucial for traffic control and management, and accurate traffic flow prediction can significantly benefit roadway safety and traffic efficiency. To this end, we quantitatively evaluated the performance of our proposed model (i.e., a combination of EEMD and an ANN model, abbreviated as EEMD+ANN) at different prediction horizons. More specifically, we predicted traffic flow at varied steps ahead based on traffic flow data at different time scales. For instance, 3-step-ahead prediction at the 2 min time scale predicts the traffic flow 6 min ahead from the 2 min data. The same rule applies to 1-step, 6-step, and 10-step prediction and to the other time scales (1 and 10 min). For the purpose of model performance comparison, we also implemented a combination of EMD and an ANN model (abbreviated as EMD+ANN) and a conventional ANN model to predict the traffic flow. Note that 70% of the traffic flow data was used to train the artificial neural network, and 10% of the data samples were used for network validation and parameter fine-tuning. The remaining 20% of the data was used to evaluate the performance of the traffic flow prediction model. Both the input and feedback delays were selected from 1 to 2, and the hidden layer size was 10.
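The relationship between prediction step, time scale, and horizon, and the 70/10/20 data partition, can be sketched as follows. The helper names are hypothetical, the two-sample input window mirrors the stated delays of 1 to 2, and the chronological (rather than random) split is an assumption, since the text does not say how the samples were assigned.

```python
def horizon_minutes(steps_ahead, scale_minutes):
    """Prediction horizon in minutes for a given step count and time scale."""
    return steps_ahead * scale_minutes


def make_samples(series, n_lags, steps_ahead):
    """Pair the latest n_lags observations with the value steps_ahead later."""
    samples = []
    for t in range(n_lags - 1, len(series) - steps_ahead):
        window = series[t - n_lags + 1:t + 1]
        samples.append((window, series[t + steps_ahead]))
    return samples


def chronological_split(samples, train=0.7, val=0.1):
    """Split into training, validation, and test parts in time order."""
    n = len(samples)
    n_train = round(n * train)
    n_val = round(n * val)
    return (samples[:n_train],
            samples[n_train:n_train + n_val],
            samples[n_train + n_val:])


print(horizon_minutes(3, 2))  # 6: three steps ahead on 2 min data
pairs = make_samples(list(range(10)), n_lags=2, steps_ahead=3)
print(pairs[0])               # ([0, 1], 4)
```

A chronological split keeps the test samples strictly after the training samples, which avoids leaking future observations into training; a random split would be the other common choice.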
3.1. Data
We collected the traffic flow data from loop detectors installed on freeways in the state of Minnesota, USA, and data access was supported by the Minnesota Department of Transportation and the Transportation Data Research Laboratory at the University of Minnesota Duluth [35]. The original publicly accessible data included traffic volume, speed, and occupancy, and we only recorded the traffic flow data from three neighboring detectors installed on the same freeway lane. The traffic flow data covered the period from 1 January 2016 to 15 January 2016. Note that the time resolution of the raw data was 30 s, and we aggregated the data into 1, 2, and 10 min intervals for the traffic flow prediction task. More details about the experiment are shown in Table 1.
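The aggregation from 30 s records to the three time scales can be sketched as below. The function name is hypothetical, and the sketch assumes that the records are vehicle counts that are summed within each bin and that an incomplete final bin is dropped; the text does not specify either detail.

```python
def aggregate_counts(counts_30s, minutes):
    """Sum consecutive 30 s vehicle counts into bins of the given length."""
    per_bin = minutes * 2  # two 30 s records per minute
    usable = len(counts_30s) - len(counts_30s) % per_bin  # drop a partial last bin
    return [sum(counts_30s[i:i + per_bin]) for i in range(0, usable, per_bin)]


# Twenty minutes of hypothetical data with one vehicle per 30 s record.
raw = [1] * 40
print(len(aggregate_counts(raw, 1)))   # 20 one-minute bins
print(len(aggregate_counts(raw, 2)))   # 10 two-minute bins
print(len(aggregate_counts(raw, 10)))  # 2 ten-minute bins
```

If both the first and last day are included and no records are missing, 15 days of 30 s data give 43,200 records per detector, that is, 21,600 samples at 1 min, 10,800 at 2 min, and 2160 at 10 min.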
3.2. Traffic Flow Prediction Results Analysis
3.2.1. Parameter Settings
The added white noise and the ensemble number are closely related to the EEMD denoising performance, and thus we first describe how the two parameters were determined. Equation (10) relates the ensemble number and the added white noise to an indicator of the EEMD denoising performance on the input traffic flow data. Note that a larger value of this indicator implies superior data cleansing performance and vice versa. However, an obvious weakness is that better denoising results (i.e., a larger indicator value) require a higher computation cost (e.g., more computation resources and longer computation time). Previous studies have suggested that default values of 0.2 for the added white noise and 1000 for the ensemble number obtain satisfactory performance in many denoising applications [31].
In our study, we applied the EEMD model to smooth the 1 min traffic flow data in order to determine the optimal parameter settings for the EEMD model. More specifically, we obtained the spectrograms of the EEMD-smoothed data under different settings of the two parameters and analyzed them to determine the parameters. The settings (added white noise and ensemble number, respectively) were as follows: (1) 0.1 and 500, (2) 0.2 and 1000, (3) 0.3 and 1500, and (4) 0.4 and 2000. For simplicity, we only present the spectrograms of the first four IMFs, which clearly differ from those of the remaining IMFs. The spectrogram distributions of the 1 min traffic data for the first four IMFs (denoted as IMF1, IMF2, IMF3, and IMF4) for detector ID #5802 are shown in Figure 3, while Figure 4 and Figure 5 show the IMF spectrogram distributions for detector ID #5805 and #5808, respectively.
As shown in Figure 3, the spectrum features were very similar under the four groups of parameter settings in the EEMD model. Note that a larger intensity (i.e., closer to the red area) in the color bar in Figure 3 (and likewise in Figure 4 and Figure 5) indicates high-frequency components and vice versa. The four IMFs represent the details of the traffic flow data, and thus the IMF spectrograms that contain more high-frequency components were considered to give better smoothing results. IMF4 and IMF3 in Figure 3a,b, respectively, were quite close to the low-frequency area, which indicated that the EEMD model may over-suppress noise in the raw traffic flow data under these settings. Although the spectrograms of IMF1, IMF2, and IMF4 were quite similar, we observed that the IMF3 spectrogram in Figure 3d contained more high-frequency components (the color of the spectrogram is closer to the red area) than the IMF3 spectrogram in Figure 3c. The above spectrogram distribution analyses also apply to Figure 4 and Figure 5 (i.e., detector ID #5805 and #5808). Based on the above analysis, we set the default values of the added white noise and the ensemble number to 0.4 and 2000, respectively.
3.2.2. Traffic Flow Smoothing Results
Traffic volume is supposed to be smooth considering that the vehicles passing through a detector area gradually increase (or decrease) without significant vehicle speed (or traffic volume) variation at a given time span. In that manner, we believe that traffic volume data is composed of continuous and smooth components (i.e., noise-free traffic flow data) and noises. Moreover, the EEMD model can efficiently exploit the stochastic data outliers that exist in the traffic volume (which are identified as noise-IMFs). Based on the above analysis, we employed the EEMD model to minimize the data noise. We presented the EEMD smoothing results on the traffic flow data collected at loop detector ID #5802, and then verified the denoising performance on the traffic flow data from the other two detectors (i.e., detector ID #5805 and #5808, respectively).
Although the 1 min traffic flow dataset contained a larger number of samples, we could still observe that the EEMD method suppressed the obvious outliers in the raw traffic flow data, as burr samples were significantly reduced (see Figure 6a). More specifically, the raw 1 min traffic data obtained from detector ID #5802 (see the black curve in Figure 6a) showed sudden fluctuations in peak and trough areas, and spikes, dips, and choppy samples could be found periodically at different time spans. Note that a few traffic volume samples were 30% larger than the neighboring data (with an extreme data sample reaching two-fold higher than its neighbors). Such data errors posed a significant threat to the traffic flow prediction performance. It is noted that the traffic volume at larger time scales (e.g., 2 and 10 min) contained fewer outliers than that at the 1 min time scale (see Figure 6b,c, respectively). The main reason was that the data aggregation procedure significantly reduced the traffic volume outliers at the larger time scales.
The raw 2 and 10 min traffic volume data and the corresponding smoothed data for detector ID #5802 are shown in Figure 6b,c. The EEMD smoothing results on the 2 and 10 min traffic data are easier to observe (compared to the 1 min counterparts) because the data samples are more evenly dispersed in the Cartesian coordinate system. We noticed that the smoothed 10 min traffic data was quite smooth, with the spikes, dips, and choppy samples successfully suppressed. More specifically, the obvious abnormal oscillations and burr data (caused by detector damage, etc.) in the 10 min traffic volume data series were corrected to more reasonable values. The EEMD smoothing results on the data from loop detector ID #5805 and #5808 (see Figure 7 and Figure 8, respectively) were very similar to those of detector ID #5802. From the perspective of qualitative analysis, we considered that the EEMD model successfully suppressed outliers in the raw traffic flow data.
We employed three evaluation metrics (i.e., median absolute deviation (MAD), mean square error (MSE), and Pearson correlation coefficient (PCC)) to quantitatively analyze the traffic flow data denoising performance, and the corresponding results are shown in Table 2 [31]. The MAD values for the three detectors at the 1 min time scale were 6.942 (for detector ID #5802), 6.921 (for detector ID #5805), and 6.554 (for detector ID #5808). The MAD distributions at the 2 and 10 min time scales showed a similar variation tendency to that of the 1 min time scale. That is, the MAD values for different detectors at the same time scale were quite close. Moreover, the distributions of the MSE and PCC indicators confirmed that the EEMD smoothing results for different detectors at the same time scale were very similar.
Table 2 also shows that the aggregated traffic flow data at a larger time scale may lead to smoothing performance loss. More specifically, the MAD and MSE indicators for the 2 min time scale for the same detector were nearly two-fold compared to the counterparts of the 1 min time scale (e.g., the MAD and MSE for detector ID #5802 at the 2 min time scale were 13.665 and 14.540, respectively, while the counterparts at 1 min were 6.942 and 7.576). Note that the MAD and MSE at 10 min were approximately three-fold higher than those of the 2 min data, which were ten-fold larger in comparison with the 1 min data. Additionally, the PCC indicator variation at different time scales did not show an obvious difference and was bigger than 0.940 for each detector at each time scale. Based on the above qualitative and quantitative analyses, we considered that EEMD obtained a satisfactory smoothing performance at different time scales, and the smoothing accuracy was higher at smaller time scales (i.e., smoothing accuracy at 1 min > smoothing accuracy at 2 min > smoothing accuracy at 10 min).
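The "nearly two-fold" statement can be checked directly from the values quoted for detector ID #5802; the short calculation below uses only those four numbers.

```python
# MAD and MSE for detector ID #5802, as quoted in the text.
mad = {"1 min": 6.942, "2 min": 13.665}
mse = {"1 min": 7.576, "2 min": 14.540}

mad_ratio = mad["2 min"] / mad["1 min"]
mse_ratio = mse["2 min"] / mse["1 min"]

print(f"MAD ratio (2 min / 1 min): {mad_ratio:.2f}")  # 1.97
print(f"MSE ratio (2 min / 1 min): {mse_ratio:.2f}")  # 1.92
```

Both ratios are slightly below 2, consistent with the two-fold description.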
3.2.3. Traffic Flow Prediction Analysis
The 1-step traffic flow prediction results are shown in Table 3, and the 3-step, 6-step, and 10-step prediction results are presented in Table 4, Table 5 and Table 6. Owing to page limitations, we describe only the 1-step traffic flow prediction results in detail. For the 1 min traffic flow data, the MAE values for the EEMD+ANN framework (detector IDs #5802, #5805, and #5808) were 0.145, 0.148, and 0.139, which were approximately the same. The MAEs of the EMD+ANN and traditional ANN models were both larger than those of the EEMD+ANN counterparts (from the same detector). For instance, the MAE values for EMD+ANN and ANN at 1 min for detector ID #5802 were 0.206 and 2.765, which were about 42% higher than and about 19 times the EEMD+ANN MAE of 0.145, respectively. The traffic flow prediction performance at 2 and 10 min was similar to the 1 min results.
The MAPE measures the relative prediction error of the traffic flow prediction models, while the RMSE measures the deviation between the predicted and the ground truth data. The MAPE indicators for detector ID #5802 at the 1 min time scale for the three models (i.e., EEMD+ANN, EMD+ANN, and ANN) were 1.780, 2.903, and 34.450, which indicated that the ANN prediction error was significantly larger than that of the other two models. Moreover, the EEMD+ANN prediction error (in terms of MAPE) was about 39% lower than that of EMD+ANN. The MAPE distributions at the 10 min scale showed similar variation to those of the 1 and 2 min scales (see Table 3). To sum up, the MAPE distributions indicated that the EEMD+ANN model obtained the minimal prediction loss (i.e., the maximal prediction accuracy). In addition, the RMSE indicator variation in Table 3 shows a similar performance to the MAE and MAPE statistics. Based on the above analysis, we considered that the EEMD+ANN model obtained a satisfactory performance on the 1-step traffic flow prediction task.
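The relative gaps between the three models can be checked from the 1-step, 1 min values quoted above for detector ID #5802. The helper name `percent_change` is illustrative.

```python
def percent_change(value, baseline):
    """Relative change of value with respect to baseline, in percent."""
    return 100.0 * (value - baseline) / baseline


# 1-step, 1 min results for detector ID #5802, as quoted in the text.
mae = {"EEMD+ANN": 0.145, "EMD+ANN": 0.206, "ANN": 2.765}
mape = {"EEMD+ANN": 1.780, "EMD+ANN": 2.903, "ANN": 34.450}

emd_mae_excess = percent_change(mae["EMD+ANN"], mae["EEMD+ANN"])
ann_mae_factor = mae["ANN"] / mae["EEMD+ANN"]
eemd_mape_saving = percent_change(mape["EEMD+ANN"], mape["EMD+ANN"])

print(f"EMD+ANN MAE vs EEMD+ANN: {emd_mae_excess:+.1f}%")     # +42.1%
print(f"ANN MAE / EEMD+ANN MAE: {ann_mae_factor:.1f}x")        # 19.1x
print(f"EEMD+ANN MAPE vs EMD+ANN: {eemd_mape_saving:+.1f}%")   # -38.7%
```

The gap between the two hybrid models is therefore moderate, while the gap between either hybrid model and the plain ANN is more than an order of magnitude.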
The 3-step traffic flow prediction results, which confirmed that the prediction errors (in terms of the MAE, MAPE, and RMSE) for the ANN model were larger than those of the EEMD+ANN model, are shown in Table 4. Similarly, the 6-step and 10-step traffic flow prediction results shown in Table 5 and Table 6 demonstrated that the EEMD+ANN model prediction accuracy at different time scales (for the same detector data) was higher than that of its counterparts. Moreover, the prediction accuracy decreased as the prediction step increased. For instance, the MAPEs for the EEMD+ANN model at the 1 min time scale under 1-step, 3-step, 6-step, and 10-step prediction were 1.780, 1.915, 9.188, and 10.046 (see the sixth row in Table 3, Table 4, Table 5 and Table 6). We can infer that traffic flow prediction at longer steps may be affected by unexpected factors (e.g., asymmetric volume distributions).
To visualize the prediction performance, we calculated the average MAE, MAPE, and RMSE values for the three detectors, which are shown in Figure 9. In terms of the average prediction accuracy on the 1 min traffic flow data, the hybrid models (i.e., EEMD+ANN and EMD+ANN) performed better than the ANN model. More specifically, the average MAEs for the EEMD+ANN hybrid model at 1-step, 3-step, 6-step, and 10-step were all smaller than 1 (see the upper-left subplot in Figure 9) and slightly lower than those of the EMD+ANN model. Moreover, the EEMD+ANN model outperformed the EMD+ANN model on the prediction task at different time scales. The same phenomenon can be observed in the 2 and 10 min traffic flow prediction accuracy (see Figure 9). It was noted that the average MAEs obtained by the ANN model were three times higher than those of the hybrid models. The main reason was that the denoising procedure eliminated noisy traffic flow data, which were substituted with reasonable data. In this manner, interference from abnormal oscillations in the raw data was suppressed during the traffic flow prediction procedure.
Figure 9 also indicated that predicting a longer time period may result in larger errors (i.e., a larger MAE, MAPE, and RMSE). In particular, the MAE, MAPE, and RMSE showed an obvious increasing tendency as the prediction step became larger. In other words, the variation of the statistical indicators at larger prediction steps demonstrated a decrease in prediction performance. One of the potential reasons is that the training samples at longer time intervals are insufficient, and thus the prediction models may fail to fully exploit the intrinsic traffic flow pattern. To sum up, the traffic flow prediction task on the 1 min data obtained better performance (in terms of the aggregated MAE, MAPE, and RMSE) than that on the 2 and 10 min data.
4. Conclusions
It is not easy to accurately predict traffic flow from historical information due to its nonlinear and unstable features. We proposed an ensemble framework with EEMD and an ANN model to predict traffic flow data at different yet typical time scales (i.e., 1, 2, and 10 min). The EEMD model decomposed the raw traffic flow data into different IMFs, and the noisy IMFs were suppressed while the other IMFs were aggregated into the noise-free traffic flow data. After this, the ANN model was introduced to predict the traffic flow at 1-step, 3-step, 6-step, and 10-step ahead under different time scales (1, 2, and 10 min). The experimental results showed that the hybrid models (i.e., EEMD+ANN and EMD+ANN) significantly outperformed the conventional ANN model on the traffic flow prediction task. More specifically, the MAEs, MAPEs, and RMSEs obtained by the EEMD+ANN model at different time scales were smaller than those of EMD+ANN and ANN. The proposed framework can be easily transferred to support other traffic data prediction tasks (speed, density, etc.) due to the generality of both EEMD and the ANN model.
We can further expand our research in the following aspects. First, although we tested the performance of the proposed prediction model at different time scales, we can obtain a more holistic assessment by comparing it with other traffic flow prediction models, such as ARIMA. Second, we only tested our model’s performance on a short-term traffic flow prediction task, and its performance on a long-term traffic prediction task deserves further attention. Third, we can apply deep learning methods to the short-term traffic flow prediction task, which may provide additional interesting results and findings. Last but not least, accurate traffic flow prediction results can yield more realistic and real-time traffic state information, which can help improve transportation efficiency at a more refined level.