The rapid development of modern traffic has led to an explosion of traffic flows. As a result, stop-and-go driving at intersections causes more and more air pollution and wasted energy. To mitigate these problems, smart applications such as Green Light Optimal Speed Advisory (see [1]) have been developed to help vehicles avoid unnecessary stops at signalized intersections. The precondition for all of these applications is that the signals are known in advance. Normally, this information is obtained from Signal Phase and Timing messages broadcast by road side units in modern Cooperative Intelligent Transport Systems. Figure 1 shows a basic communication structure for V2I (Vehicle to Infrastructure) based on IoT (Internet of Things). The field device consists of a TLC (Traffic Light Controller) connected to an RSU (Road Side Unit), which first broadcasts the current MAP (map describing the intersection geometry) and SPAT (Signal Phase and Timing) messages. Then, the traffic center delivers application data from the TLC to the public transport strategy computer, which finally generates the calculated future MAP and SPAT messages. Both the current and future SPAT messages can therefore be received by an OBU (On-Board Unit) installed in a vehicle for further processing. However, a growing number of traffic participants, such as autonomous vehicles and public transport, compete to request the signal messages. Future traffic signals can also be affected by sensors that detect queued vehicles. On the one hand, in such cases the priority of public transport cannot be fully guaranteed. On the other hand, this competition increases the risk associated with such radio-based communication and the cost of the large number of communication modules deployed in intelligent transport systems. Therefore, methods that predict future traffic signals and thereby avoid heavy direct communication with the infrastructure are being explored.
Previously, upcoming traffic signals were forecast mainly with mathematical and statistical approaches. Wang et al. [2] used a Kalman filter to predict the traffic state. Menig et al. [3] adopted Markov chains to calculate the occurrence probabilities of several signal states. However, these approaches achieve unsatisfactory accuracy and transportability for actuated traffic systems, in which signal changes depend on the requests from different traffic flows. Later, with the rapid growth of the data collected by different detectors, machine learning models attracted more attention [
4]. Weisheit and Hoyer [5] applied Support Vector Machines to predict possible future traffic states, dividing the states into groups for classification. Heckmann et al. [6] further defined stages that group related signal states, which allowed three states to be forecast in advance. The authors viewed signal prediction as a regression problem and compared the performance of different combinations of Extreme Gradient Boosting and Bayesian Networks (see [7]). However, these works assume that the traffic cycle time is fixed, which does not hold for actuated traffic signals. Another research perspective is to view signal prediction as a time-series forecasting problem [
8, 9]. Khosravi et al. [10] used machine learning to predict time-series wind speed data from a wind farm in Brazil. The researchers compared an Adaptive Neuro-Fuzzy Inference System and hybrid models, a Multilayer Feed-Forward Neural Network, Support Vector Regression, a Fuzzy Inference System, and a Group Method of Data Handling type neural network; this comparison suggests that traffic signals can likewise be treated as time-series data. Genser et al. [11] worked to standardize Signal Phase and Timing messages in order to forecast the residual time of each phase. They applied a Random Survival Forest model to forecast the time to green and compared it with the baseline models Auto-Regressive Integrated Moving Average and Linear Regression. They also pointed out the high potential of the Long Short-Term Memory (LSTM) model for such time-series problems. Zhou et al. [
12] proposed the Informer to address long-sequence time-series forecasting. It is a modified Transformer with increased prediction capacity, and it was successfully applied to predict electricity consumption over a long period. Tang et al. [13] revisited one-dimensional convolutional neural networks (1D-CNNs) from an omni-scale perspective for time-series classification tasks and provided a stronger baseline. Building on these studies, this research explores machine learning methods that predict future traffic signals as time-series data. The LSTM, Baseline Model, Linear Model, Dense Model, and Convolutional Neural Network are applied and compared for traffic signal forecasting in this work.
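To make the time-series framing concrete, the following minimal sketch shows one way a recorded signal-state sequence could be cut into (history, future) pairs and scored against a naive persistence forecast. It is an illustration under stated assumptions, not the pipeline used in this work: the 0/1 state encoding, the sample sequence, the window sizes, and the function names `make_windows`, `persistence_forecast`, and `stepwise_accuracy` are all hypothetical, and the persistence rule is only one plausible reading of a baseline model.

```python
from typing import List, Tuple

Window = Tuple[List[int], List[int]]


def make_windows(series: List[int], input_width: int, horizon: int) -> List[Window]:
    """Slide over the series and return every (history, future) pair."""
    pairs: List[Window] = []
    last_start = len(series) - input_width - horizon
    for start in range(last_start + 1):
        split = start + input_width
        pairs.append((series[start:split], series[split:split + horizon]))
    return pairs


def persistence_forecast(history: List[int], horizon: int) -> List[int]:
    """Assumed naive baseline: repeat the last observed state for each future step."""
    return [history[-1]] * horizon


def stepwise_accuracy(pairs: List[Window], horizon: int) -> float:
    """Fraction of individual future steps that the persistence forecast gets right."""
    correct = 0
    total = 0
    for history, future in pairs:
        predicted = persistence_forecast(history, horizon)
        correct += sum(p == f for p, f in zip(predicted, future))
        total += len(future)
    return correct / total if total else 0.0


# Hypothetical one-second samples of a single signal group: 0 = red, 1 = green.
signal = [0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1]
pairs = make_windows(signal, input_width=4, horizon=2)
baseline_accuracy = stepwise_accuracy(pairs, horizon=2)
```

Any learned model can be evaluated on the same pairs, so the persistence score serves as a floor that a trained forecaster should exceed, especially at longer horizons where signal changes become more likely.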
The rest of this paper is organized as follows.
Section 2 introduces the ways in which the collected data are processed, as well as the basic structure of the researched machine learning models.
Section 3 describes the forecasting results and further analyzes the test accuracy and basic metrics for different time horizons.
Section 4 discusses the results and outlines future research directions.