A Novel Spatiotemporal Periodic Polynomial Model for Predicting Road Traffic Speed

Jiang, Shan; Feng, Yuming; Liao, Xiaofeng; Wu, Hongjuan; Liu, Jinkui; Onasanya, Babatunde Oluwaseun

doi:10.3390/sym16050537

¹ College of Computer Science, Chongqing University, Chongqing 400044, China

² School of Computer Science and Engineering, Chongqing Three Gorges University, Chongqing 404100, China

³ Key Laboratory of Intelligent Information Processing and Control, Chongqing Three Gorges University, Chongqing 404100, China

⁴ College of Mathematics and Statistics, Chongqing Three Gorges University, Chongqing 404100, China

⁵ Department of Mathematics, University of Ibadan, Ibadan 200005, Nigeria

* Authors to whom correspondence should be addressed.

Symmetry 2024, 16(5), 537; https://doi.org/10.3390/sym16050537

Submission received: 19 March 2024 / Revised: 20 April 2024 / Accepted: 26 April 2024 / Published: 30 April 2024

(This article belongs to the Section Engineering and Materials)

Abstract

Accurate and fast traffic prediction is the data-based foundation for achieving traffic control and management, and the accuracy of prediction results will directly affect the effectiveness of traffic control and management. This paper proposes a new spatiotemporal periodic polynomial model for road traffic, which integrates the temporal, spatial, and periodic features of speed time series and can effectively handle the nonlinear mapping relationship from input to output. In terms of the model, we establish a road traffic speed prediction model based on polynomial regression. In terms of spatial feature extraction methods, we introduce a maximum mutual information coefficient spatial feature extraction method. In terms of periodic feature extraction methods, we introduce a periodic trend modeling method into the prediction of speed time series, and effective fusion is carried out. Four strategies are evaluated based on the Guangzhou road speed dataset: a univariate polynomial model, a spatiotemporal polynomial model, a periodic polynomial model, and a spatiotemporal periodic polynomial model. The test results show that the three methods proposed in this article can effectively improve prediction accuracy. Comparing the spatiotemporal periodic polynomial model with multiple machine learning models and deep learning models, the prediction accuracy is improved by 5.94% compared to the best feedforward neural network. The research in this article can effectively deal with the temporal, spatial, periodic, and nonlinear characteristics of speed prediction, and to a certain extent, improve the accuracy of speed prediction.

Keywords:

traffic speed; prediction; polynomial regression; period; spatial features

1. Introduction

The road system is an important infrastructural element of China’s economic and social systems. As of the end of 2020, the national road mileage had reached 5.19 million kilometers. Therefore, the road system bears a huge amount of traffic pressure throughout the country and plays an important role in the rapid development of China’s economy and residents’ lives [1]. According to the traffic analysis report of major cities in China in 2021 released by Gaode Map [2], with the rapid development of the economy and improvements in the social consumption levels of residents, the growth rate of car ownership remains high. According to the report [3], managing and alleviating traffic congestion in the road system, and strengthening the control capabilities of traffic management departments over roads in major cities, has become one of the key tasks in digital transportation construction.

An efficient traffic management system can promote the development of transportation, logistics, and various economic sectors, and has certain significance for the coordinated and balanced development of various regions and functions in a city. With the vigorous development of the intelligent transportation industry, the popularity of intelligent transportation systems is also increasing, and traffic control has also been a hot issue in recent years. Accurate and fast traffic prediction is the data-based foundation for achieving traffic control, and the accuracy of traffic congestion prediction results will directly affect the effectiveness of traffic control [4]. For travelers, traffic congestion prediction data can be used to formulate reasonable and convenient travel route planning and travel time planning. For traffic managers in major cities, traffic congestion prediction [5] can help the traffic management department grasp the traffic operation status in real time, control the traffic system reasonably and efficiently, improve the operational efficiency of the traffic system, and reduce and ease traffic congestion.

The main characteristics of the traffic speed prediction problem include: (1) the input variables and output variables have a nonlinear mapping relationship; (2) the speed time series has a high degree of periodicity; and (3) there is a certain correlation between the speeds of different road sections. At present, in the field of traffic prediction, the main models used for prediction are divided into three categories. The first category comprises models based on statistical theory, which assume that the future data to be predicted have the same characteristics as the past data [6]. This type of model mainly includes the ARIMA prediction model [7] for time-series analysis, Kalman filter analysis [8], etc. The second category contains deep learning models, which are becoming more and more widely used in machine learning. In addition to being applied in research on image recognition, speech recognition, and other areas, the long short-term memory model (LSTM) is widely used in the field of traffic prediction because of its superior accuracy in time-series prediction [9]. The third category is made up of integration models [10], which improve on the prediction accuracy of a single model by combining multiple basic learners of the same or different types [11].

The theme of this article is traffic speed prediction. This article references several papers from the Symmetry journal. Ge H et al. studied traffic flow prediction on highways [12]. Alajali, W et al. studied traffic prediction using decision trees [13]. Xing, Ban et al. proposed a symmetric extreme learning machine cluster fast learning method for traffic prediction [14]. These papers have provided enlightenment for the research of this article.

Currently, research on traffic speed prediction mainly adopts statistical learning methods, deep learning methods, and ensemble model methods. These methods are limited by their insufficient prediction accuracy and cannot comprehensively consider the nonlinear, periodic, and spatial characteristics of speed prediction. The optimization problem corresponding to the training process of existing neural network models is a non-convex optimization problem, and the parameters often converge to a local optimum point. This leads to insufficient prediction accuracy. Currently, linear statistical learning models can only describe linear mapping relationships, which is insufficient for prediction problems. However, the polynomial model proposed in this paper can solve both problems simultaneously.

Because polynomial functions can better approximate and fit most nonlinear functions, the parameter solving problem of polynomial functions can usually obtain the global optimal solution through optimization algorithms [15,16]. Approximating the nonlinear relationship of speed prediction using polynomial functions can improve the accuracy of speed prediction to a certain extent. Therefore, this article will propose a new traffic speed prediction model based on polynomial functions.

When studying the prediction of traffic speed on roads, we have gained inspiration from research related to traffic flow prediction, which suggests that considering the characteristics of space or different regions may be very beneficial [17,18]. However, due to the presence of a large number of spatial features and corresponding features from different regions, considering the integration of all of these features would make our model exceptionally large. Therefore, we need to consider screening and selecting features to ensure that our model can accurately predict traffic speed to the maximum extent possible under controllable conditions. Based on this reality, we will propose a new traffic speed feature selection method [19], which is based on the maximum mutual information coefficient.

At the same time, considering the periodic variation of road traffic speed time series can improve model prediction accuracy [20]. We will also introduce periodic trend features to model the periodic characteristics of the time series, in order to further improve the accuracy and robustness of the prediction model.

Section 2 of this article is a literature review. Section 3 is the method. Section 3.1 introduces the prediction model. Section 3.2 describes model training. Section 3.3 discusses spatial and temporal feature extraction. Section 3.4 discusses period feature extraction. Section 3.5 showcases the evaluation metrics. Section 3.6 describes model integration. Section 4 covers numerical testing. Section 4.1 is a description of the dataset used in this article. Section 4.2 contains spatial feature analysis. Section 4.3 discusses correlation network structure. Section 4.4 covers period analysis. Section 4.5 discusses prediction accuracy. Section 4.6 presents a model comparison. Section 5 is a summary of the article.

2. Literature Review

The prediction of road traffic speed is a typical traffic prediction problem. There are currently few documents in the field of road traffic speed prediction, so this article draws on the literature surrounding several other issues in traffic prediction. At present, many scholars have conducted research in the field of traffic prediction [21,22].

The travel time problem aims to estimate the travel time for a given OD input, which consists of a starting point, an ending point, and a departure time. In the field of travel time prediction, the main research methods include K-nearest neighbor regression [23], multi-layer perceptron, ResNet [24], long-short term memory network, and convolutional neural network [25]. The travel demand problem aims to predict the future traffic demand in each area of the city. In the field of travel demand forecasting, the main methods include multi-layer perceptron [26], convolutional neural network and recurrent neural network [27,28], graph convolutional network, and graph attention mechanism [29,30]. The regional traffic flow problem is defined as predicting the future traffic flow between regions. In the field of regional traffic forecasting, the main methods include the historical average, ARIMA, integrated methods [31,32], convolutional neural network [33], and long and short-term memory network [34]. In the field of network flow prediction, the main methods currently used include autoencoder [35], recurrent neural network [36], and meta learning [37].

In the field of speed prediction, the current mainstream methods include the historical average and ARMA [31]. In addition to the historical average method, some scholars currently use deep learning methods to study the problem of road traffic speed prediction, including convolutional neural network, recurrent neural network, and feedforward neural network [38,39], long short-term memory network, and graph convolutional neural network [40]. Some scholars [39] divide the traffic map into several rectangular grids, then calculate the average speed of each grid, and treat the data at a time point as an image or two-dimensional matrix. Then, they use convolutional neural network-related methods to extract features and use them for final speed prediction. Some scholars have used a bidirectional LSTM model with both forward and backward dependencies to consider the temporal characteristics of speed time series for road traffic speed prediction. Some scholars [39] combine CNN and RNN, first using CNN to capture the relevant features of road structure in speed sequences, and then using RNN to consider multiple factors for speed time-series prediction. Some scholars [40] have also introduced more data and factors, including geographic information attributes, road network structure, and social information, to improve the accuracy of speed prediction.

3. Method

This section will first discuss a prediction model based on polynomial functions, which can fit the nonlinear mapping of speed series. Second, this section will discuss the parameter training or parameter solving method of the model. The optimization problem of this model is a convex optimization problem. Therefore, the training of this model is relatively effective. Third, this section will discuss a spatial feature selection method based on the maximum mutual information coefficient. Fourth, this section will discuss a periodic feature extraction method based on the PCA method. Fifth, this section will discuss evaluation metrics for testing. Finally, this section will discuss how to combine polynomial functions, spatiotemporal features, and periodic features into a fusion model.

3.1. Prediction Model

Polynomial functions can provide a good approximation for most nonlinear functions, and can fit most nonlinear relationships. The training process of polynomial regression is a convex optimization problem that can obtain a global optimal solution [41,42]. This article combines polynomial functions with Lasso regression to construct a polynomial regression model, and proposes a road speed prediction model based on it. In addition, this paper demonstrates the improvement in prediction accuracy brought about by the spatial selection method and the periodic feature modeling method on top of the polynomial regression model. The input features of polynomial regression are as follows:

S_{i,t}^{\mathrm{input}_1} = \{ S_{i,t-τ} \mid τ = 1, \dots, T \}

(1)

S_{i,t}^{\mathrm{input}_2} = \{ S_{k,t-τ} \mid τ = 1, \dots, T_k, \; k = 1, \dots, num \}

(2)

S_{i,t}^{\mathrm{input}_3} = S_{i,t}^{p}

(3)

S_{i,t}^{\mathrm{input}} = [S_{i,t}^{\mathrm{input}_1}, S_{i,t}^{\mathrm{input}_2}, S_{i,t}^{\mathrm{input}_3}]

(4)

where i is the road ID, t is the time ID, num is the total number of considered road sections, T is the maximum time lag considered, T_k is the maximum time lag considered for other regions, S_{i,t}^{\mathrm{input}_1} is the basic variable part for predicting the i-th road section at the t-th time moment, S_{i,t}^{\mathrm{input}_2} is the other road section feature part of the input, S_{i,t}^{\mathrm{input}_3} is the periodic feature part, and S_{i,t}^{\mathrm{input}} is the input of the merged model.

The features of the road section are the speed of the past T time intervals of the road section, the speed of the past T_k time intervals of the other road sections (how to select these features will be explained later), and the periodic feature is the speed of the periodic trend at the same time (how to extract the periodic trend will be introduced later). The features of the road segment itself, the features of other road segments, and periodic features are combined to form the final feature set.
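As a minimal sketch of how the merged input of Equation (4) could be assembled, the following function concatenates the three parts for one road and one time index. The function name, the dictionary layout of `speeds`, the zero-based time index, and the `slots_per_day` parameter are all illustrative assumptions and are not taken from the paper.

```python
from typing import Dict, List, Sequence


def build_input(
    speeds: Dict[int, Sequence[float]],
    periodic_trend: Sequence[float],
    road: int,
    t: int,
    own_lag: int,
    neighbours: Sequence[int],
    neighbour_lag: int,
    slots_per_day: int,
) -> List[float]:
    """Assemble the merged input vector for one road at time index t.

    Assumptions (hypothetical, for illustration only): `speeds[k]` is the
    full speed series of road k, t is a zero-based index into that series,
    and the series starts at the first time slot of a day.
    """
    if t < max(own_lag, neighbour_lag):
        raise ValueError("not enough history before time index t")
    # Part 1: the road's own speeds at t-1, ..., t-T.
    own = [speeds[road][t - tau] for tau in range(1, own_lag + 1)]
    # Part 2: speeds of the selected other roads at t-1, ..., t-T_k.
    others = [
        speeds[k][t - tau]
        for k in neighbours
        for tau in range(1, neighbour_lag + 1)
    ]
    # Part 3: the periodic-trend speed at the same time of day.
    periodic = [periodic_trend[t % slots_per_day]]
    return own + others + periodic
```

The ordering of the components is arbitrary as long as it is the same for training and prediction, since each component receives its own coefficients.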

The input components are recorded as follows:

S_{i,t}^{\mathrm{input}} = [S_{i,t,1}, S_{i,t,2}, \dots, S_{i,t,M}]

(5)

where S_{i,t,m} is the m-th component of the input variable, i is the segment ID, t is the time subscript, and m is the input component subscript. The polynomial regression model in this paper is as follows:

\begin{aligned}
S_{i,t}^{\mathrm{output}} &= f(S_{i,t}^{\mathrm{input}}) \\
&= ω_0 + \sum_{m=1}^{M} ω_m S_{i,t,m} \\
&\quad + \sum_{m_1=1}^{M} \sum_{m_2=m_1}^{M} ω_{m_1,m_2} S_{i,t,m_1} S_{i,t,m_2} \\
&\quad + \sum_{m_1=1}^{M} \sum_{m_2=m_1}^{M} \sum_{m_3=m_2}^{M} ω_{m_1,m_2,m_3} S_{i,t,m_1} S_{i,t,m_2} S_{i,t,m_3}
\end{aligned}

(6)

where ω_0 is the bias term, ω_m is the coefficient of the m-th first-order component, ω_{m_1,m_2} is the coefficient of the second-order component, ω_{m_1,m_2,m_3} is the coefficient of the third-order component, S_{i,t}^{\mathrm{input}} is the input of the model, S_{i,t}^{\mathrm{output}} is the output of the model (the traffic speed of the i-th section at the t-th moment), and M is the number of variables input into the model. Because the inputs of the prediction models differ between road sections, a separate model is established for each road section.
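The nested sums in Equation (6) range over index tuples with m_1 ≤ m_2 ≤ m_3, which is exactly what combinations with replacement enumerate. The sketch below expands an input vector into the corresponding monomials and evaluates the model for a given weight vector. The function names and the ordering of the terms are illustrative assumptions, not the authors' implementation.

```python
from itertools import combinations_with_replacement
from math import comb, prod
from typing import List, Sequence


def cubic_features(x: Sequence[float]) -> List[float]:
    """Return the bias term followed by all monomials of degree 1, 2 and 3."""
    features = [1.0]  # multiplies the bias weight
    for degree in (1, 2, 3):
        # Index tuples are non-decreasing, matching m_1 <= m_2 <= m_3.
        for idx in combinations_with_replacement(range(len(x)), degree):
            features.append(prod(x[m] for m in idx))
    return features


def predict(weights: Sequence[float], x: Sequence[float]) -> float:
    """Evaluate the cubic polynomial model for one input vector."""
    return sum(w * f for w, f in zip(weights, cubic_features(x)))


def n_terms(m: int) -> int:
    """Number of coefficients of a cubic model with m input variables."""
    return comb(m + 3, 3)
```

The coefficient count grows as C(M+3, 3), so the model becomes large quickly as inputs are added, which is one reason a sparsity-inducing penalty is a natural companion to this expansion.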

3.2. Model Solution

Solving for the model parameters is equivalent to solving the following optimization problem:

\min_{ω} J(S, ω) = \sum_{t=1}^{T} \left( S_{i,t}^{\mathrm{output}} - f_i(S_{i,t}^{\mathrm{input}} \mid ω) \right)^2 + λ \|ω\|_1

(7)

where f_i is the aforementioned polynomial regression model, ω is the model parameter vector, and the sum runs over the training time points. The first part of the objective minimizes the prediction error, and the second part is a 1-norm regularization term. Because 1-norm regularization drives most parameters to 0, it automatically filters the high-dimensional features. The gradient descent method is used to solve the model; the overall steps are given in Algorithm 1 [43].

Algorithm 1: Solving parameters of polynomial regression models.

Input: Training set of the traffic speed dataset, regularization parameter λ, gradient descent threshold θ.
Output: Model parameter ω.

Step 1. Calculate J(S, ω) based on historical data.

Step 2. Calculate the partial derivative with respect to each component of the parameter:

\frac{\partial J}{\partial ω_0} = \sum_{t=1}^{T} 2 \left( f_i(S_{i,t}^{\mathrm{input}} \mid ω) - S_{i,t}^{\mathrm{output}} \right) + λ I(ω_0)

\frac{\partial J}{\partial ω_m} = \sum_{t=1}^{T} 2 \left( f_i(S_{i,t}^{\mathrm{input}} \mid ω) - S_{i,t}^{\mathrm{output}} \right) S_{i,t,m} + λ I(ω_m)

\frac{\partial J}{\partial ω_{m_1,m_2}} = \sum_{t=1}^{T} 2 \left( f_i(S_{i,t}^{\mathrm{input}} \mid ω) - S_{i,t}^{\mathrm{output}} \right) S_{i,t,m_1} S_{i,t,m_2} + λ I(ω_{m_1,m_2})

\frac{\partial J}{\partial ω_{m_1,m_2,m_3}} = \sum_{t=1}^{T} 2 \left( f_i(S_{i,t}^{\mathrm{input}} \mid ω) - S_{i,t}^{\mathrm{output}} \right) S_{i,t,m_1} S_{i,t,m_2} S_{i,t,m_3} + λ I(ω_{m_1,m_2,m_3})

where

I(ω) = \begin{cases} 1, & ω > 0 \\ 0, & ω = 0 \\ -1, & ω < 0 \end{cases}

Step 3. Collect the partial derivatives into the gradient g and update the model parameters with step size α:

ω^{\mathrm{new}} = ω^{\mathrm{old}} - α g

Step 4. If \|g\| \geq θ, the algorithm returns to Step 1. If \|g\| < θ, the algorithm stops iterating.
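The steps of Algorithm 1 can be sketched in NumPy as follows. This is a minimal illustration under several assumptions that are not stated in the paper: the design matrix `X` already holds the expanded polynomial features with a leading column of ones for the bias, the stopping test uses the Euclidean norm of the gradient, and an iteration cap `max_iter` is added as a safeguard because a fixed-step subgradient method with λ > 0 may never push the gradient norm below the threshold.

```python
import numpy as np


def fit_lasso_subgradient(X, y, lam, alpha, theta, max_iter=10_000):
    """Subgradient descent on the squared error plus lam * ||w||_1.

    As in Algorithm 1, every weight (including the bias in column 0) is
    penalised, and the sign function plays the role of I(w).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.zeros(X.shape[1])
    for _ in range(max_iter):
        residual = X @ w - y                          # f(x_t | w) - y_t for every t
        g = 2.0 * X.T @ residual + lam * np.sign(w)   # Step 2: (sub)gradient
        w = w - alpha * g                             # Step 3: parameter update
        if np.linalg.norm(g) < theta:                 # Step 4: stopping test
            break
    return w


# Toy usage: recover y = 1 + 2x without regularisation (lam = 0).
X_demo = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y_demo = np.array([1.0, 3.0, 5.0, 7.0])
w_demo = fit_lasso_subgradient(X_demo, y_demo, lam=0.0, alpha=0.02, theta=1e-8)
```

The step size must be small enough for the quadratic part to converge; in practice the features would also be rescaled before fitting, since cubic terms of raw speeds span very different magnitudes.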

3.3. Spatiotemporal Feature Extraction

The maximum mutual information coefficient method [44] is a method to analyze the correlation between two variables. It helps to find the nonlinear correlation between two variables, so it can be widely used in variable selection in predictive models.

The basic principle of the maximum mutual information coefficient method is based on information theory. In information theory, information can be measured by entropy. The greater the entropy, the greater the amount of information. The maximum mutual information coefficient method measures the correlation between two variables by calculating the mutual information between them. Mutual information refers to the sharing degree of information between two variables. The greater the mutual information between two variables, the stronger the correlation between them.

S_{i} = [S_{i, 1}, S_{i, 2}, \dots, S_{i, T - 1}]

(8)

S_{j} = [S_{j, 2}, S_{j, 3}, \dots, S_{j, T}]

(9)

where S_{i,t} is the traffic speed of the i-th section at the t-th time point. The maximum mutual information coefficients of section i and section j are defined as follows:

I(S_i, S_j) = \sum_{S_i, S_j} p(S_i, S_j) \log_2 \frac{p(S_i, S_j)}{p(S_i) p(S_j)}

(10)

MIC_{i,j} = \max_{|S_i| |S_j| < B} \left( \frac{I(S_i, S_j)}{\log_2 \min(|S_i|, |S_j|)} \right)

(11)

(11)

where B is the hyperparameter that limits the number of grids. The MIC values between the target road segment and the candidate feature road segments are calculated, and road segments with MIC values greater than a threshold are used as input variables for the prediction model. If more than K road sections exceed the threshold, only the K road sections with the highest nonlinear correlation coefficients are introduced as input features. If fewer than K road sections exceed the threshold, other road sections are not considered when establishing the model.
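The selection rule described above can be written compactly once the MIC values are available. The sketch below takes precomputed scores as input and does not compute MIC itself; the function name and the dictionary layout are illustrative assumptions.

```python
from typing import Dict, List


def select_roads(mic_scores: Dict[int, float], threshold: float, k: int) -> List[int]:
    """Pick the other road sections to use as spatial inputs.

    `mic_scores` maps a candidate road ID to its MIC value with the target
    road (assumed to be computed elsewhere).
    """
    above = [road for road, score in mic_scores.items() if score > threshold]
    if len(above) < k:
        # Too few strongly related roads: fall back to the road's own history.
        return []
    # Keep the k candidates with the highest MIC values.
    above.sort(key=lambda road: mic_scores[road], reverse=True)
    return above[:k]
```

With a threshold of 0.8 and K = 3, four candidates scoring 0.95, 0.90, 0.85 and 0.82 would yield the first three, while two candidates above the threshold would yield an empty selection.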

3.4. Periodic Feature Extraction

Due to the strong periodicity of road speeds, the speed curves of different days are strongly similar. This article uses principal component analysis [45] to extract the periodic patterns of the speed curves. First, a data matrix is constructed, where S_{t,d}^{i} represents the road traffic speed for the i-th section, t-th time period, and d-th day. Thus:

S = \begin{bmatrix} S_{1,1}^{i} & \dots & S_{1,D}^{i} \\ \dots & \dots & \dots \\ S_{T,1}^{i} & \dots & S_{T,D}^{i} \end{bmatrix}

(12)

The sample data in data set S is standardized according to the following formula:

S_{t, d}^{i} = \frac{S_{t, d}^{i} - μ_{d}}{σ_{d}}

(13)

where μ_d is the mean value of day d, and σ_d is the standard deviation of day d. The covariance matrix is calculated according to the data matrix S:

D = \frac{1}{m} S^{T} S

(14)

All the eigenvalues of the covariance matrix D are found and arranged from largest to smallest. Then, the eigenvectors corresponding to the first K eigenvalues are selected and arranged in rows to form the transformation matrix W, which is used to reduce the dimension of the data. In the process of using the PCA algorithm, the score matrix SC and the coefficient matrix C are obtained. To obtain the final trend, the first K columns of SC and the first K rows of C are taken, and these two matrices are multiplied to obtain the PCA periodic pattern. The calculation formula is as follows:

pca = \mathrm{mean}(SC^{K} \times C^{K})

(15)

It is worth noting that when extracting cycle features, the time series is divided into two parts, weekdays and weekends, and the weekday and weekend trends are extracted separately. When predicting the traffic speed on weekdays, this article uses the weekday periodic trend as a feature. When predicting the traffic speed on weekends, this article uses the weekend periodic trend as a feature. Speeds differ significantly between non-peak and peak periods, and the periodic trend feature proposed in this paper captures this difference.
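One way to read Equations (12) to (15) is as a rank-K reconstruction of the standardized day-by-day speed matrix, averaged over days. The NumPy sketch below follows that reading. It is an interpretation rather than the authors' code: it uses the singular value decomposition in place of an explicit eigendecomposition of the covariance matrix (the two give the same principal directions), it assumes the mean in Equation (15) is taken across days, and it returns the trend in standardized units.

```python
import numpy as np


def periodic_trend(S, k):
    """Extract a daily trend from a (time slots x days) speed matrix.

    Each column (day) is standardised with its own mean and standard
    deviation, the matrix is approximated with its first k principal
    components, and the approximation is averaged over the day axis.
    A day with constant speed would give a zero standard deviation and
    is not handled here.
    """
    S = np.asarray(S, dtype=float)
    Z = (S - S.mean(axis=0)) / S.std(axis=0)        # Equation (13), per day
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = U[:, :k] * s[:k]                       # first k score columns
    coeffs = Vt[:k, :]                              # first k coefficient rows
    return (scores @ coeffs).mean(axis=1)           # one value per time slot
```

In line with the weekday and weekend separation described above, the function would be called once on the weekday columns and once on the weekend columns.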

3.5. Evaluation Metrics

The evaluation metric adopted in this paper is the mean absolute percentage error (MAPE). The MAPE value of the i-th road is calculated as follows:

MAPE = \frac{1}{T} \sum_{t=1}^{T} \frac{|S_{i,t} - Sr_{i,t}|}{Sr_{i,t}} \times 100\%

(16)

where Sr_{i,t} is the observed value of road traffic speed, S_{i,t} is the predicted value of road traffic speed, and T is the number of time points. In addition, the improvement of model 2 relative to model 1 is defined as follows:

IMP_{1,2} = \frac{MAPE_1 - MAPE_2}{MAPE_1} \times 100\%

(17)

where MAPE_1 and MAPE_2 are the MAPE values of model 1 and model 2, respectively, and IMP_{1,2} represents the improvement of model 2 over model 1.
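Equations (16) and (17) translate directly into two short functions. The names below are illustrative; both functions return values in percent.

```python
from typing import Sequence


def mape(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Mean absolute percentage error, Equation (16), in percent."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - o) / o for p, o in zip(predicted, observed))
    return 100.0 * total / len(observed)


def improvement(mape_1: float, mape_2: float) -> float:
    """Relative improvement of model 2 over model 1, Equation (17), in percent."""
    return 100.0 * (mape_1 - mape_2) / mape_1
```

For example, predictions of 45 and 55 km/h against observations of 50 km/h give a MAPE of 10%, and reducing a MAPE of 10% to 8% is a relative improvement of 20%. Note that the improvement is relative to MAPE_1, not a difference in percentage points.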

3.6. Model Integration

This article combines the polynomial model, spatial feature selection method, and periodic trend extraction to form a spatiotemporal periodic polynomial model. Firstly, based on the spatial feature filtering method, the road sections which are useful for traffic prediction are selected, and the speed values of these road sections are taken as inputs for the model. Secondly, the periodic trend extraction method is used to extract the periodic trend, and the velocity values at the same time in different periods are taken as inputs for the model. Then, the time, space, and periodic features are used as inputs for the polynomial model. Finally, the model in this article is trained using historical data and evaluated on the test set.

4. Numerical Testing

This section will conduct numerical experiments in traffic speed prediction based on real data sets. First, this section will introduce the description of the data set. Second, this section will show the spatial correlation between different roads. Third, this section will show the network structure of spatial correlation analysis. Fourth, this section will conduct a periodic analysis of speed time series. Fifth, this section will analyze the prediction accuracy of the proposed method and show the improvement in prediction accuracy brought about by polynomial functions, spatial features, and periodic features. Sixth, this section will compare the prediction accuracy of the proposed method with several deep learning models, statistical learning models, and ensemble models. Finally, this section will analyze time complexity and spatial complexity.

4.1. Data Set

The data set consisted of 214 anonymous road sections (mainly main roads and expressways) in Guangzhou [46], with measured vehicle speed data spanning two months (1 August 2016 to 30 September 2016) at a time window of 10 min. The vehicle speed data set contained a total of 1,855,589 vehicle speed records. It contained the following fields: (1) Road_id: the number of the road section, for example, “1” represents the first road section; (2) Day_id: the number of the day, where “1” represents 1 August 2016, “2” represents 2 August 2016, and so on, up to “61”, which represents 30 September 2016; (3) Time_id: the number of the time window, where “1” represents 00:00:00–00:10:00 and “2” represents 00:10:00–00:20:00; and (4) Speed: the vehicle speed value (in km/h). The speed data in this article were collected through induction coils installed on the road. The speed of each road segment in each 10 min window was the average speed of passing vehicles measured by the induction coil. The first 10 records of the road speed dataset are shown in Table 1.
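The ID fields described above map to calendar times in a simple way. The helper below is a small sketch of that mapping; it assumes Time_id runs from 1 to 144 within a day (the text only gives the first two windows), and the function and constant names are illustrative. It also computes how many records a complete grid of 214 roads, 61 days and 144 windows would contain, which suggests that the 1,855,589 records cover roughly 98.7% of that grid.

```python
from datetime import date, datetime, timedelta
from typing import Tuple

FIRST_DAY = date(2016, 8, 1)                  # Day_id 1, as stated in the text
WINDOW_MINUTES = 10
WINDOWS_PER_DAY = 24 * 60 // WINDOW_MINUTES   # 144 ten-minute windows


def decode(day_id: int, time_id: int) -> Tuple[datetime, datetime]:
    """Return the start and end of the window for 1-based Day_id and Time_id."""
    day = FIRST_DAY + timedelta(days=day_id - 1)
    midnight = datetime(day.year, day.month, day.day)
    start = midnight + timedelta(minutes=WINDOW_MINUTES * (time_id - 1))
    return start, start + timedelta(minutes=WINDOW_MINUTES)


# Record count of a complete grid, and the share actually present.
FULL_GRID = 214 * 61 * WINDOWS_PER_DAY
COVERAGE = 1_855_589 / FULL_GRID
```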

The dataset used in this article did not contain specific location information for road segments, only the IDs of road segments. However, the method in this article considered the correlation between different road segments through the MIC coefficient, which to some extent made up for this problem.

Figure 1 and Figure 2 show the speed variation curves of multiple road sections contained in the dataset of this article, as well as the speeds at different road sections at the same time. As can be seen from Figure 1, there was a strong periodic characteristic in the speed time series. As can be seen from Figure 2, there were significant differences in the traffic speed at different road sections at the same time.

Figure 3 shows the variation pattern of traffic speed on sections 1–9 at different times on different days. As can be seen from the figure, the speed of each road segment was strongly periodic in time. The model in this article significantly improved prediction accuracy by considering this periodic characteristic. The figure also shows a strong correlation between the time series of different road segments. The model in this article significantly improved prediction accuracy by considering these spatial characteristics.

4.2. Spatial Feature Analysis

The maximum mutual information coefficient method described above was used to calculate the nonlinear correlation between different road sections. For data analysis, data from the first 10 roads in the sample were selected for processing. The correlations calculated between these roads are shown in Figure 4. As can be seen from Figure 4, there was a strong correlation between some road segments. Therefore, considering spatial feature selection can improve the prediction accuracy of the model.

4.3. Dependency Network Structure

Figure 5 shows the dependency relationship between different roads. For each dependent variable, nodes with a MIC value greater than 0.8 were selected, and the connection was treated as having a dependency relationship, while the other nodes were treated as having no relationship. The network is visualized in Figure 5.
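The thresholding that defines the network can be sketched as follows. The function name and the nested-list layout of the MIC matrix are illustrative assumptions. Edges are kept directed here because Equations (8) and (9) pair the earlier speeds of one road with the later speeds of another, so the score from i to j need not equal the score from j to i.

```python
from typing import List, Sequence, Tuple


def dependency_edges(
    mic: Sequence[Sequence[float]], threshold: float = 0.8
) -> List[Tuple[int, int]]:
    """Return directed edges (i, j) where road i is a strong predictor of road j.

    `mic[i][j]` is assumed to hold the MIC value between the lagged series
    of road i and the series of road j; self-loops are skipped.
    """
    n = len(mic)
    return [
        (i, j)
        for i in range(n)
        for j in range(n)
        if i != j and mic[i][j] > threshold
    ]
```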

From the graph, it can be seen that there was a certain degree of nonlinear correlation between the traffic speeds in different regions. Therefore, when predicting the traffic speed of a region, introducing speed data from other regions can effectively improve the prediction accuracy.

4.4. Periodicity Analysis

In this study, a statistical graph of the first road section was plotted to demonstrate the variation pattern of road vehicle speed from 8 August 2016 to 12 August 2016. Specifically, the data was processed at 10 min intervals, using different date IDs to distinguish different traffic data, and displaying the data in different colors. The horizontal axis represented the time ID, while the vertical axis represented the vehicle speed. In Figure 6, the curves of different date IDs show similar characteristics, and overall, it can be seen that there was a significant periodic trend in road traffic speed on weekdays.

Using the cycle extraction method from the previous section, it was first necessary to collect data on the target road to obtain data on travel time and distance. On this basis, the periodic pattern of the target road was extracted by processing the data. For road 1, after data processing and period mining, a daily periodic speed curve and a speed curve from 8 August to 12 August were obtained, as shown in Figure 7. The green curve displays the periodic speed curve, while the red points represent the observed speeds from 8 August to 12 August. From Figure 7, it can be seen that the cycle extraction method proposed in this article can efficiently and accurately extract the periodic pattern of road traffic speed, providing an important data foundation for subsequent predictions of road traffic speed.

4.5. Prediction Accuracy

In this experiment, the data from the previous 48 days (1 August 2016 to 17 September 2016) was taken as the training set, and the data from the following 13 days (18 September 2016 to 30 September 2016) was taken as the test set. According to our experience, the order of the polynomial model can lead to overfitting if it is too large, and it can be equivalent to a linear model if it is too small. In this article, the order of the polynomial model was set to three. When calculating cycle trends, they were divided into weekdays and weekends. The time lag length of the road itself was set to five. The nonlinear correlation coefficient threshold was set to 0.8. The maximum number of nodes considered was three. If the number of independent variables was greater than three, the three roads with the highest coefficients were considered. If the number of independent variables greater than the threshold was less than three, only the univariate model was considered.
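The experimental settings above can be collected in one place, together with the day-based split and the weekday or weekend flag used to choose the periodic trend. This is a sketch only: the dictionary keys, the record layout (a dict with a `day_id` field), and the function names are hypothetical, and the weekend test is purely calendar-based.

```python
from datetime import date, timedelta
from typing import Dict, List, Tuple

# Settings as reported in the text; the key names are illustrative.
SETTINGS = {
    "poly_order": 3,       # order of the polynomial model
    "own_lag": 5,          # time lag length of the road itself
    "mic_threshold": 0.8,  # nonlinear correlation coefficient threshold
    "max_roads": 3,        # maximum number of other roads considered
    "train_days": 48,      # Day_id 1..48 for training, 49..61 for testing
}

FIRST_DAY = date(2016, 8, 1)  # Day_id 1


def is_weekend(day_id: int) -> bool:
    """True if the 1-based Day_id falls on a Saturday or Sunday."""
    return (FIRST_DAY + timedelta(days=day_id - 1)).weekday() >= 5


def split_by_day(
    records: List[Dict], train_days: int = SETTINGS["train_days"]
) -> Tuple[List[Dict], List[Dict]]:
    """Split records chronologically on their day_id field."""
    train = [r for r in records if r["day_id"] <= train_days]
    test = [r for r in records if r["day_id"] > train_days]
    return train, test
```

Splitting on the day rather than shuffling individual records keeps every test window strictly later than the training data, which matches the chronological split described above.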

Figure 8 shows the comparison between the predicted and observed values of the nonlinear spatial period prediction model on the test set. The data points in the figure represent observed values and predicted values. The closer their values are to the straight line y = x, the better the prediction accuracy. It can be seen from the figure that the method proposed in this article can effectively achieve traffic speed prediction.

This section compares the gains in prediction accuracy that the spatiotemporal and periodic feature extraction methods bring relative to the univariate model. Table 2 lists the MAPE of four models: the polynomial model, which considers only the temporal features of a single variable; the spatiotemporal polynomial model, which adds spatial features; the periodic polynomial model, which adds periodic features; and the spatiotemporal periodic polynomial model, which considers both. As can be seen from Table 2, introducing periodic features improves the prediction accuracy on most road segments, as does introducing spatial features. A model that considers both periodic and spatial features achieves better results still.

The model evaluation criteria and the model improvement formula in Section 3.5 were used to calculate the improvement in prediction accuracy of the spatial model and the periodic model compared to the univariate model. The results are shown in Figure 9 and Figure 10. For most roads, both the spatial model and the periodic model improve prediction accuracy.

From Table 2, Figure 9, and Figure 10, it can be seen that the prediction accuracy of the univariate model in this article can already meet practical needs. Averaged over all roads, introducing spatial features improved prediction accuracy by 1.41%, and introducing periodic features improved it by 1.5%. The model incorporating both periodic trend extraction and the maximum mutual information coefficient improved accuracy by 2.24% on average compared to the univariate polynomial regression model.
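To make the per-road comparison concrete, the sketch below computes the relative MAPE reduction of each model over the univariate polynomial model for the nine road segments listed in Table 2. The formula used here, (baseline − model) / baseline × 100, is an assumption about the form of the improvement formula in Section 3.5, and the function name `relative_improvement` is hypothetical. Because only nine of the 214 segments are shown, the averages over this subset are not expected to match the all-road averages reported above.

```python
def relative_improvement(mape_baseline, mape_model):
    """Relative MAPE reduction in percent (assumed form of the formula)."""
    return (mape_baseline - mape_model) / mape_baseline * 100.0


# MAPE values copied from Table 2 for the nine listed road segments.
roads = [1, 2, 3, 4, 5, 211, 212, 213, 214]
univariate = [5.01, 5.91, 7.47, 8.48, 5.20, 3.08, 9.04, 4.26, 2.81]
periodic = [4.90, 5.81, 7.26, 8.28, 5.09, 3.05, 8.95, 4.20, 2.79]
spatiotemporal = [4.93, 5.76, 7.22, 8.26, 5.13, 3.05, 9.05, 4.21, 2.74]
combined = [4.86, 5.74, 7.10, 8.13, 5.09, 3.03, 8.96, 4.17, 2.79]

gains = {
    name: [relative_improvement(b, m) for b, m in zip(univariate, values)]
    for name, values in [
        ("periodic", periodic),
        ("spatiotemporal", spatiotemporal),
        ("combined", combined),
    ]
}

for name, per_road in gains.items():
    worse = [road for road, g in zip(roads, per_road) if g < 0]
    print(f"{name}: mean gain {sum(per_road) / len(per_road):.2f}%, "
          f"roads with no gain: {worse}")
```

On this subset, the spatiotemporal model is marginally worse than the univariate model on road 212 (9.05 versus 9.04), which is consistent with the statement that the gains hold for most roads rather than every road.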

4.6. Model Comparison

To further verify the accuracy of our method, we compared our model with deep learning models, machine learning models, and ensemble learning models. The assessed deep learning models include feedforward neural networks (FNN) [47] and long short-term memory networks (LSTM) [48].

The neurons of the FNN are arranged in layers, and each neuron is connected only to the neurons in the previous layer. Each layer receives the output of the previous layer and passes its own output to the next layer, with no feedback between layers. It is one of the most widely used and rapidly developing artificial neural networks. The hyperparameters and options for the FNN were set as follows: the network consisted of four fully connected layers and four ReLU activation layers, the number of neurons in the hidden layer was set to 64, the loss function was MAE, and the optimizer was Adam.

LSTM is a special type of recurrent neural network (RNN) designed to handle long-term dependencies. This network has been used to solve a variety of problems and is still widely used today. The hyperparameters and options for LSTM were set as follows: the number of hidden layer units in the LSTM model was set to 500. The input was a time series with a length of 5 and a feature dimension of 214 (the number of road segments), and the output was a vector of 214 elements. The activation function was ReLU, the loss function was MAE, and the optimizer was Adam.

The assessed machine learning models included robust linear regression (RLR) [49] and K-nearest neighbor (KNN) [50]. RLR is a form of regression analysis that aims to overcome some limitations of traditional parametric and non-parametric methods by being less affected by violations of regression assumptions in the underlying data generation process. The KNN algorithm is an instance-based machine learning method that can be used for classification and regression problems. The idea is very simple, but it has shown excellent accuracy in practice. The fitting weight function of the RLR assessed here was bisquare, and the penalty constant was set to 10⁻³. The number of neighboring points in KNN was 5, with uniform weights and the Minkowski distance used as the distance metric.

The ensemble learning models compared here include random forest [51] and AdaBoost [52]. Random forest refers to a predictor that uses multiple trees to train on and predict samples. AdaBoost is an iterative algorithm, the core idea of which is to train different predictors (weak predictors) on the same training set, and then combine these weak predictors to form a stronger final predictor (strong predictor). In the random forest model, the number of learners was set to 30, the maximum depth of the decision tree was set to 5, and the split criterion was set to squared error. The base learner of the AdaBoost model was set to a decision tree, the maximum depth of the decision tree was set to 5, the split criterion was set to squared error, and the number of learners was set to 30.
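The KNN baseline configuration above (5 neighbors, uniform weights, Minkowski distance) can be sketched in plain Python. This is a minimal illustration and not the implementation used in the experiments: the function names, the toy data, and the Minkowski order p = 2 (Euclidean distance) are assumptions, since the text does not state the order.

```python
def minkowski(a, b, p=2):
    """Minkowski distance of order p between two equal-length vectors."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def knn_predict(train_x, train_y, query, k=5, p=2):
    """Predict a target as the unweighted mean of the k nearest neighbors."""
    order = sorted(
        range(len(train_x)),
        key=lambda i: minkowski(train_x[i], query, p),
    )
    nearest = order[:k]
    # Uniform weights: every selected neighbor contributes equally.
    return sum(train_y[i] for i in nearest) / len(nearest)


# Toy data: each input is a window of five lagged speeds (invented values),
# and the target is the speed at the next time step.
train_x = [[float(i)] * 5 for i in range(6)] + [[100.0] * 5]
train_y = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 100.0]
prediction = knn_predict(train_x, train_y, [2.0] * 5)
```

With uniform weights, a distant neighbor inside the top five counts as much as the closest one, which is why the choice of k matters more here than with distance weighting.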

All models were tested on 214 road sections. As in Section 4.5, the data from the first 48 days (1 August 2016 to 17 September 2016) was used as the training set, and the data from the last 13 days (18 September 2016 to 30 September 2016) was used as the test set. MAPE was used as the evaluation indicator, and the comparison results are shown in Table 3. The MAPE values were 7.07% for the FNN, 9.18% for the LSTM, 8.11% for the RLR, 7.38% for the KNN, 7.12% for random forest, and 7.44% for AdaBoost. Our model achieved a MAPE value of 6.65%, which was lower than that of all the comparison models. This corresponds to a 5.94% relative reduction in MAPE compared to the FNN, the best of the comparison models.
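The 5.94% figure can be reproduced from Table 3 as a relative reduction in MAPE. Assuming the improvement formula of Section 3.5 has this form (the subscripts below are labels chosen for this illustration), the calculation is:

```latex
\text{Improvement}
  = \frac{\mathrm{MAPE}_{\mathrm{FNN}} - \mathrm{MAPE}_{\mathrm{ours}}}{\mathrm{MAPE}_{\mathrm{FNN}}} \times 100\%
  = \frac{7.07 - 6.65}{7.07} \times 100\%
  = \frac{0.42}{7.07} \times 100\%
  \approx 5.94\%
```

In absolute terms, the difference is 0.42 percentage points of MAPE.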

4.7. Time Complexity and Space Complexity

To analyze the time cost of the model more comprehensively, three quantities were measured: the time spent on extracting features from the entire dataset, the time spent on training the model on the entire dataset, and the time required for making a prediction with the model. To analyze the space cost of the model, the memory occupied by the model parameters was measured. The experiments were run on a computer with a 1.9 GHz processor, 16 GB of memory, and a 64-bit operating system. The results are shown in Table 4. From the table, it can be seen that the method in this article can efficiently complete training and testing on commonly available hardware.
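Quantities like those in Table 4 can be measured with a wall-clock timer and a count of the bytes needed to store the coefficients. The sketch below is an illustration under stated assumptions, not the measurement code used for Table 4: the helper names `timed` and `parameter_bytes` are hypothetical, the toy model uses per-lag powers up to order three with no cross terms (the actual term structure is defined in Section 3.1), and the coefficients are placeholders rather than fitted values.

```python
import time
from array import array


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def parameter_bytes(coefficients):
    """Bytes needed to store the coefficients as 64-bit floats."""
    return array("d", coefficients).itemsize * len(coefficients)


def predict(coefficients, lags, order=3):
    """Evaluate intercept + sum of c * lag**power for powers 1..order.

    Cross terms between lags are omitted here, which is an assumption
    made to keep the sketch short.
    """
    value = coefficients[0]
    index = 1
    for x in lags:
        for power in range(1, order + 1):
            value += coefficients[index] * x ** power
            index += 1
    return value


# Placeholder model: five lags, order three -> 1 + 5 * 3 = 16 coefficients.
coefficients = [1.0] + [0.0] * 15
lags = [44.2, 43.6, 44.4, 44.2, 44.5]  # invented recent speeds
speed, seconds = timed(predict, coefficients, lags)
size = parameter_bytes(coefficients)
```

A single call is too short to time reliably, so in practice the prediction would be repeated many times and the elapsed time averaged.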

5. Conclusions

This paper proposed a new polynomial model integrating temporal, spatial, and periodic features for road traffic speed prediction, and showed that it improves the prediction accuracy of traffic speed. To address the nonlinear modeling of traffic speed, a road speed prediction method based on a polynomial regression model was proposed. The spatial features of traffic speed were then integrated into the model. To obtain a model that considers spatial features without containing too many variables, a spatial feature selection method based on the maximum mutual information coefficient was proposed. Given the strong periodicity of road traffic speed, introducing periodic features into a traffic speed prediction model can effectively improve prediction accuracy. Therefore, we also analyzed the periodic pattern of traffic speed and introduced periodic features. In numerical experiments, the prediction accuracies of a univariate polynomial model, a spatiotemporal polynomial model, a periodic polynomial model, and a spatiotemporal periodic polynomial model were evaluated.

Compared with the univariate polynomial model, the prediction accuracy of the spatiotemporal polynomial model improved by 1.41% on average, that of the periodic polynomial model improved by 1.5% on average, and that of the spatiotemporal periodic polynomial model improved by 2.24% on average. The test results showed that the three methods proposed in this paper can effectively improve prediction accuracy. In addition, our model was compared with FNN, LSTM, RLR, KNN, random forest, and AdaBoost models, and it achieved a lower MAPE than all of these general machine learning and deep learning models. The prediction accuracy of our model was 5.94% better than that of the FNN, the best of the comparison models.

The spatiotemporal periodic polynomial model in this article is well suited to data with complex mapping relationships, strong periodicity, and strong spatial correlation. Therefore, this method can provide more accurate road traffic speed prediction results, which has important practical significance for the implementation of intelligent traffic management systems and the reduction of traffic congestion. At present, the method in this article can only predict the road traffic speed in the next 10 min. In subsequent research, we will further discuss how to predict the road traffic speed over a longer period of time.

Author Contributions

Conceptualization, S.J. and Y.F.; methodology, S.J.; validation, S.J.; formal analysis, S.J. and Y.F.; data curation, S.J.; writing—original draft preparation, S.J.; writing—review and editing, S.J., Y.F., X.L., H.W., J.L. and B.O.O.; visualization, S.J.; supervision, Y.F. and X.L.; project administration, Y.F. and X.L.; funding acquisition, Y.F. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Science and Technology Research Program of Chongqing Municipal Education Commission (KJZD-M202201204), the Rural Revitalization Special Project of Chongqing Science and Technology Bureau (No. CSTB2023TIAD-ZXX0017), and the Foundation of Intelligent Ecotourism Subject Group of Chongqing Three Gorges University (No. zhlv20221028).

Data Availability Statement

The data that support the findings of this study are available from OpenITS Open Data, but restrictions apply to the availability of these data, which were used under license for the current study, and so the data are not publicly available. Data are however available from the authors upon reasonable request and with the permission of OpenITS Open Data.

Acknowledgments

The authors gratefully acknowledge the OpenITS Open Data for their effort in making the data available.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Yu, Z.; Zhao, P. The factors in residents’ mobility in rural towns of China: Car ownership, road infrastructure and public transport services. J. Transp. Geogr. 2021, 91, 102950.
2. Map, A. Traffic Analysis Report of Major Cities in China; Gaode Map: Beijing, China, 2018.
3. Wei, H.; Nian, M.; Li, L. China’s strategies and policies for regional development during the period of the 14th five-year plan. Chin. J. Urban. Environ. Stud. 2020, 8, 2050008.
4. Li, G.; Liao, Y.; Guo, Q.; Shen, C.; Lai, W. Traffic crash characteristics in Shenzhen, China from 2014 to 2016. Int. J. Environ. Res. Public Health 2021, 18, 1176.
5. Akhtar, M.; Moridpour, S. A review of traffic congestion prediction using artificial intelligence. J. Adv. Transp. 2021, 2021, 8878011.
6. Barros, J.; Araujo, M.; Rossetti, R.J. Short-term real-time traffic prediction methods: A survey. In Proceedings of the 2015 IEEE International Conference on Models and Technologies for Intelligent Transportation Systems (MT-ITS), Budapest, Hungary, 3–5 June 2015; pp. 132–139.
7. Alghamdi, T.; Elgazzar, K.; Bayoumi, M.; Sharaf, T.; Shah, S. Forecasting traffic congestion using ARIMA modeling. In Proceedings of the 2019 IEEE 15th International Wireless Communications and Mobile Computing Conference (IWCMC), Tangier, Morocco, 24–28 June 2019; pp. 1227–1232.
8. Zhou, T.; Jiang, D.; Lin, Z.; Han, G.; Xu, X.; Qin, J. Hybrid dual Kalman filtering model for short-term traffic flow forecasting. IET Intell. Transp. Syst. 2019, 13, 1023–1032.
9. Wang, Z.; Thulasiraman, P. Foreseeing congestion using LSTM on urban traffic flow clusters. In Proceedings of the 2019 6th International Conference on Systems and Informatics (ICSAI), Shanghai, China, 2–4 November 2019; pp. 768–774.
10. Zheng, G.; Chai, W.K.; Katos, V.; Walton, M. A joint temporal-spatial ensemble model for short-term traffic prediction. Neurocomputing 2021, 457, 26–39.
11. Chen, X.; Zhang, S.; Li, L. Multi-model ensemble for short-term traffic flow prediction under normal and abnormal conditions. IET Intell. Transp. Syst. 2019, 13, 260–268.
12. Ge, H.; Huang, M.; Lu, Y.; Yang, Y. Study on traffic conflict prediction model of closed lanes on the outside of expressway. Symmetry 2020, 12, 926.
13. Alajali, W.; Zhou, W.; Wen, S.; Wang, Y. Intersection traffic prediction using decision tree models. Symmetry 2018, 10, 386.
14. Xing, Y.; Ban, X.; Liu, X.; Shen, Q. Large-scale traffic congestion prediction based on the symmetric extreme learning machine cluster fast learning method. Symmetry 2019, 11, 730.
15. Yu, Q.; Lin, Q.; Zhu, Z.; Wong, K.-C.; Coello, C.A.C. A dynamic multi objective evolutionary algorithm based on polynomial regression and adaptive clustering. Swarm Evol. Comput. 2022, 71, 101075.
16. Pang, Y.; Shi, M.; Zhang, L.; Song, X.; Sun, W. PR-FCM: A polynomial regression based fuzzy C-means algorithm for attribute-associated data. Inf. Sci. 2022, 585, 209–231.
17. Loo, B.P.Y.; Huang, Z. Spatio-temporal variations of traffic congestion under work from home (WFH) arrangements: Lessons learned from COVID-19. Cities 2022, 124, 103610.
18. Yao, H.; Wu, F.; Ke, J.; Tang, X.; Jia, Y.; Lu, S.; Li, Z. Deep multi view spatial temporal network for taxi demand prediction. In Proceedings of the AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018; p. 32.
19. Peacock, D.E.; Hu, G. Analyzing grammy, emmy, and academy awards data using regression and maximum information coefficient. In Proceedings of the 2013 Second IIAI International Conference on Advanced Applied Informatics (IIAIAAI), Los Alamitos, CA, USA, 31 August–4 September 2013; pp. 74–79.
20. Li, L.; Su, X.; Zhang, Y.; Lin, Y.; Li, Z. Trend modeling for traffic time series analysis: An integrated study. IEEE Trans. Intell. Transp. Syst. 2015, 16, 3430–3439.
21. Gheibi, M.; Karrabi, M.; Latifi, P.; Fathollahi-Fard, A.M. Evaluation of traffic noise pollution using geographic information system and descriptive statistical method: A case study in Mashhad, Iran. Environ. Sci. Pollut. Res. 2022, 1–14.
22. Shi, L.; Liu, M.; Liu, Y.; Zhao, Q.; Cheng, K.; Zhang, H.; Fathollahi-Fard, A.M. Evaluation of urban traffic accidents based on pedestrian landing injury risks. Appl. Sci. 2022, 12, 6040.
23. Wang, H.; Tang, X.; Kuo, Y.H.; Kifer, D.; Li, Z. A simple baseline for travel time estimation using large-scale trip data. ACM Trans. Intell. Syst. Technol. (TIST) 2019, 10, 1–22.
24. Li, Y.; Fu, K.; Wang, Z.; Shahabi, C.; Ye, J.; Liu, Y. Multi-task representation learning for travel time estimation. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, London, UK, 19–23 August 2018; pp. 1695–1704.
25. Yuan, H.; Li, G.; Bao, Z.; Feng, L. Effective travel time estimation: When historical trajectories over road networks matter. In Proceedings of the 2020 ACM Sigmod International Conference on Management of Data, New York, NY, USA, 14–19 June 2020; pp. 2135–2149.
26. Wang, D.; Cao, W.; Li, J.; Ye, J. DeepSD: Supply-demand prediction for online car-hailing services using deep neural networks. In Proceedings of the 2017 IEEE 33rd International Conference on Data Engineering (ICDE), San Diego, CA, USA, 19–22 April 2017; pp. 243–254.
27. Bai, L.; Yao, L.; Kanhere, S.S.; Yang, Z.; Chu, J.; Wang, X. Passenger demand forecasting with multi-task convolutional recurrent neural networks. In Proceedings of the Advances in Knowledge Discovery and Data Mining: 23rd Pacific-Asia Conference, PAKDD 2019, Macau, China, 14–17 April 2019; Proceedings, Part II 23. pp. 29–42.
28. Kuang, L.; Yan, X.; Tan, X.; Li, S.; Yang, X. Predicting taxi demand based on 3D convolutional neural network and multi-task learning. Remote Sens. 2019, 11, 1265.
29. Geng, X.; Li, Y.; Wang, L.; Zhang, L.; Yang, Q.; Ye, J.; Liu, Y. Spatiotemporal multi-graph convolution network for ride-hailing demand forecasting. In Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA, 27 January–1 February 2019; Volume 33, pp. 3656–3663.
30. Xu, Y.; Li, D. Incorporating graph attention and recurrent architectures for city-wide taxi demand prediction. ISPRS Int. J. Geo-Inf. 2019, 8, 414.
31. Cryer, J.D. Time Series Analysis; Duxbury Press: Boston, MA, USA, 1986; p. 286.
32. Leshem, G.; Ritov, Y.A. Traffic flow prediction using adaboost algorithm with random forests as a weak learner. Int. J. Math. Comput. Sci. 2007, 1, 1–6.
33. Zhang, J.; Zheng, Y.; Qi, D. Deep spatio-temporal residual networks for citywide crowd flows prediction. In Proceedings of the AAAI Conference on Artificial Intelligence, San Francisco, CA, USA, 4–9 February 2017; p. 31.
34. He, Z.; Chow, C.Y.; Zhang, J.D. STCNN: A spatio-temporal convolutional neural network for long-term traffic prediction. In Proceedings of the 2019 20th IEEE International Conference on Mobile Data Management (MDM), Hong Kong, China, 10–13 June 2019; pp. 226–233.
35. Lv, Y.; Duan, Y.; Kang, W.; Li, Z.; Wang, F.Y. Traffic flow prediction with big data: A deep learning approach. IEEE Trans. Intell. Transp. Syst. 2014, 16, 865–873.
36. Guo, S.; Lin, Y.; Feng, N.; Song, C.; Wan, H. Attention based spatial-temporal graph convolutional networks for traffic flow forecasting. In Proceedings of the AAAI Conference on Artificial Intelligence, San Francisco, CA, USA, 4–9 February 2017; Volume 33, pp. 922–929.
37. Pan, Z.; Liang, Y.; Wang, W.; Yu, Y.; Zheng, Y.; Zhang, J. Urban traffic prediction from spatio-temporal data using deep meta learning. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, New York, NY, USA, 4–8 August 2019; pp. 1720–1730.
38. Lv, Z.; Xu, J.; Zheng, K.; Yin, H.; Zhao, P.; Zhou, X. Lc-rnn: A deep learning model for traffic speed prediction. In Proceedings of the IJCAI, Stockholm, Sweden, 13–19 July 2018; Volume 2018, p. 27.
39. Ma, X.; Dai, Z.; He, Z.; Ma, J.; Wang, Y.; Wang, Y. Learning traffic as images: A deep convolutional neural network for large-scale transportation network speed prediction. Sensors 2017, 17, 818.
40. Liao, B.; Zhang, J.; Wu, C.; McIlwraith, D.; Chen, T.; Yang, S.; Guo, Y.; Wu, F. Deep sequence learning with auxiliary information for traffic prediction. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, London, UK, 19–23 August 2018; pp. 537–546.
41. De Castro, Y.; Gamboa, F.; Henrion, D.; Hess, R.; Lasserre, J.B. Approximate optimal designs for multivariate polynomial regression. Ann. Stat. 2019, 47, 127–155.
42. Gao, W.; Fan, H. Omni-channel customer experience (in) consistency and service success: A study based on polynomial regression analysis. J. Theor. Appl. Electron. Commer. Res. 2021, 16, 1997–2013.
43. Haji, S.H.; Abdulazeez, A.M. Comparison of optimization techniques based on gradient descent algorithm: A review. PalArch’s J. Archaeol. Egypt/Egyptol. 2021, 18, 2715–2743.
44. Xu, H.; Zhang, Y.; Liu, J.; Lv, D. Feature Selection Using Maximum Feature Tree Embedded with Mutual Information and Coefficient of Variation for Bird Sound Classification. Math. Probl. Eng. 2021, 1–14.
45. Zhang, Y.; Wang, Y. Forecasting crude oil futures market returns: A principal component analysis combination approach. Int. J. Forecast. 2023, 39, 659–673.
46. OpenITS Org. OpenData V12.0-Large-Scale Traffic Speed Data Set. Available online: https://www.openits.cn/openData2/792.jhtml (accessed on 30 April 2021).
47. Sazli, M.H. A brief review of feed-forward neural networks. Commun. Fac. Sci. Univ. Ank. Ser. A2–A3 Phys. Sci. Eng. 2006, 50, 11–17.
48. Yu, Y.; Si, X.; Hu, C.; Zhang, J. A review of recurrent neural networks: LSTM cells and network architectures. Neural Comput. 2019, 31, 1235–1270.
49. Yu, C.; Yao, W. Robust linear regression: A review and comparison. Commun. Stat.-Simul. Comput. 2017, 46, 6261–6282.
50. Laloë, T. A k-nearest neighbor approach for functional regression. Stat. Probab. Lett. 2008, 78, 1189–1193.
51. Belgiu, M.; Drăguţ, L. Random forest in remote sensing: A review of applications and future directions. ISPRS J. Photogramm. Remote Sens. 2016, 114, 24–31.
52. Cao, Y.; Miao, Q.-G.; Liu, J.-C.; Gao, L. Advance and prospects of AdaBoost algorithm. Acta Autom. Sin. 2013, 39, 745–758.

Figure 1. Speed variation curves for multiple road sections. Taking the first 16 road sections as an example, a total of 16 time series of traffic speeds from 8 August 2016 to 12 August 2016 are displayed.

Figure 2. Traffic speeds of different road sections at the same time. The figure displays the road traffic speed of all sections at 9:00 am on each day from 8 August 2016 to 12 August 2016.

Figure 3. The variation pattern of traffic speed on sections 1–9. The x-axis represents the time id, the y-axis represents the day id, and the z-axis represents the road speed. The closer the color is to yellow, the higher the speed value. The closer the color is to blue, the lower the speed value.

Figure 4. Cross correlation coefficients between different regions. The closer the color is to blue, the closer the value is to 1. The closer the color is to white, the closer the value is to 0.

Figure 5. Dependency relationships between different regions (using the first 30 road sections as an example).

Figure 6. Speed curves of road 1 from 8 August 2016 to 12 August 2016. Different color curves represent different days.

Figure 7. Calculation results for the periodic trend of road 1. The red curve represents the daily speed curve. The green curve represents the extracted periodic trend.

Figure 8. Comparison between the predicted value of the prediction model and the observed value.

Figure 9. Model improvements for spatiotemporal polynomial model on different road segments.

Figure 10. Improvements for periodic polynomial model on different road segments.

Table 1. First 10 records of the road speed dataset. The dataset contained a total of 1,855,589 vehicle speed records.

Road_id	Day_id	Time_id	Speed
1	1	1	40.893
1	1	2	41.938
1	1	3	44.098
1	1	4	44.483
1	1	5	44.172
1	1	6	44.416
1	1	7	43.622
1	1	8	44.202
1	1	9	42.898
1	1	10	44.123

Table 2. Comparison of MAPE results (%) between the univariate model and the models with spatial and periodic features, taking nine road sections as an example.

Road Segment ID	1	2	3	4	5	…	211	212	213	214
Polynomial Model	5.01	5.91	7.47	8.48	5.20	…	3.08	9.04	4.26	2.81
Periodic polynomial model	4.90	5.81	7.26	8.28	5.09	…	3.05	8.95	4.20	2.79
Spatiotemporal polynomial model	4.93	5.76	7.22	8.26	5.13	…	3.05	9.05	4.21	2.74
Spatiotemporal periodic polynomial model	4.86	5.74	7.10	8.13	5.09	…	3.03	8.96	4.17	2.79

Table 3. Model comparison results.

Model	MAPE
FNN	7.07%
LSTM	9.18%
RLR	8.11%
KNN	7.38%
Random forest	7.12%
Adaboost	7.44%
Spatiotemporal periodic polynomial model	6.65%

Table 4. Analysis of model time complexity and space complexity.

Evaluating Indicator	Value
Extract feature time (entire dataset)	98 s
Training time (entire dataset)	181 s
Prediction time (1 time point)	42.8 ms
Model size (entire data set)	212 KB


© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
