1. Introduction
Traffic growth and limited available capacity within the roadway system produce problems and challenges for transportation agencies. Traffic congestion affects traveler mobility and has an impact on air quality, and consequently on public health. Stop-and-go driving in traffic jams burns fuel at a higher rate than smooth travel and increases the vehicle emissions that create air pollution and are related to global warming [1]. Reducing traffic congestion improves traveler mobility and accessibility, while also reducing vehicle fuel consumption and emissions.
Traffic congestion in 2013 cost Americans billion [2], and this number is projected to rise to billion in 2030. Traffic signal controllers attempt to optimize various traffic measures (e.g., delay, queue length, and energy and emission levels) by optimizing signal control variables, including the cycle length, the phasing scheme and sequence, the phase split, and the offset. Most currently implemented traffic signal systems fall into one of the following categories: fixed-time control (FP), actuated control (ACT), responsive control, or adaptive control [3].
An FP control system is developed off-line using historical traffic data to compute traffic signal timings; real-time traffic data are not taken into account, and the duration and order of all phases stay fixed without any adaptation to real-time traffic demand fluctuations [4]. Previous studies have found this approach to be appropriate only for under-saturated conditions and traffic flows that are stable or relatively stable [5]. By comparison, ACT systems respond to changes in traffic demand patterns by communicating with the controller based on the presence or absence of vehicles as identified by local detectors installed at intersection approach stop lines. While ACT has been shown to generally perform better than FP for very low demand levels, it still offers no real-time optimization to adapt to traffic fluctuations, and may result in long network queues [6]. Adaptive systems have the potential to alleviate traffic congestion by adjusting signal timing parameters in response to real-time traffic fluctuations. These systems use detector inputs, historical trends, and predictive models to predict vehicle arrivals at intersections, and then use the predictions to determine the best gradual changes in cycle length, phase splits, and offsets to minimize vehicle delays or queue lengths [7]. Examples in this category include the Split Cycle Offset Optimization Tool (SCOOT) [8], a macroscopic model that minimizes delay and the number of vehicle stops at all intersection approaches and performs effectively in under-saturated traffic conditions. The Sydney Coordinated Adaptive Traffic System (SCATS) [9] operates in a centralized hierarchical mode and allocates green times to the phases of greatest need. OPAC [10] optimizes an objective function over a specified rolling horizon using dynamic-programming-based traffic prediction models that require a traffic environment state transition probability model, which can be difficult to generate. TR2 and UTCS-1 [11], which are optimized off-line, are incapable of handling stochastic variations in traffic patterns.
The operation of actuated and adaptive controllers is constrained by minimum and maximum cycle lengths, green indication durations, and offsets, and also requires going through a pre-defined sequence of phases. In addition, some systems use hierarchies that either partially or totally centralize decisions, rendering them more susceptible to failures. Hierarchies also make these systems more difficult to scale up, more complex to operate, and more expensive [13].
Various computational intelligence-based techniques have been investigated in the domain of traffic signal optimization, and are still under continuous research and development, including fuzzy sets, genetic algorithms, reinforcement learning, and neural networks. Genetic algorithms search for the optimal solution through an evolutionary process over candidate solutions [13,14]; they can solve simple networks and deal with static traffic volumes. However, as the network increases in size, the search space involved in finding effective signal plans increases significantly, and a large amount of centralized computing power is required. Pappis [15] proposed the first signal controller using fuzzy logic for an isolated intersection. Ella [16] proposed a neuro-fuzzy controller, where the parameters of the fuzzy membership functions were adjusted using a neural network. The neural learning algorithm in Ella’s work was reinforcement learning, which was found to be successful at constant traffic volumes, but failed when the traffic demand changed rapidly. The choice of the membership functions (the building blocks of fuzzy set theory) is important for a particular problem since it affects the fuzzy inference system. As a traffic control system is a complex large-scale system with many interactive factors, it is more appropriate to use fuzzy control for isolated intersections [17].
Several approaches have been proposed for designing traffic signal controllers using neural networks [18,19]. Most of these works are based on a distributed approach, where an agent is assigned to update the traffic signals of a single intersection. However, neural networks adapt very slowly to changing traffic parameters, so on-line learning has to take place continuously. Some networks require multiple models to be maintained for various times within a day. Most intelligence-based approaches are still under development or have only been implemented and tested on an isolated intersection, so their effectiveness for controlling a large-scale traffic network remains unknown.
Reinforcement learning (RL) is inspired by behavioral psychology [20]. It is a machine learning approach in which agents interact with the environment and attempt to learn the optimal behavior from the feedback received from those interactions. The feedback may be available right after the action or several time steps later, which makes the learning more challenging [21]. Abdulhai et al. [22] applied a model-free Q-learning technique to a simple two-phase isolated traffic signal in a two-dimensional road network. Salkham et al. [23] applied a Q-learning strategy that allowed an agent to exchange rewards with its neighbors on 64 signalized intersections. The state-action space was simple and temporally coarse: each agent decided the phase splits every two cycles, which did not capture the rapid dynamics of congestion, and coordination between the agents’ actions was missing. Studies have considered the use of RL algorithms for traffic control, but they are very limited in terms of network complexity and traffic loadings, so realistic scenarios, over-saturated conditions, and transitions from under-saturation to over-saturation (and vice versa) have not been fully explored.
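As a rough illustration of the model-free Q-learning idea mentioned above, the following minimal sketch applies one tabular update for a two-phase signal. It is not the formulation used in the cited studies: the state encoding (coarse queue levels per approach), the action names, the reward (negative total queue), and the learning parameters are all assumptions made for illustration.

```python
import random
from collections import defaultdict

# Hypothetical two-phase signal: serve north-south or east-west traffic.
ACTIONS = ("serve_ns", "serve_ew")


def q_update(q, state, action, reward, next_state, actions,
             alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update and return the new Q(s, a).

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(q[(next_state, a)] for a in actions)
    td_error = reward + gamma * best_next - q[(state, action)]
    q[(state, action)] += alpha * td_error
    return q[(state, action)]


def choose_action(q, state, actions, epsilon=0.1, rng=random):
    """Epsilon-greedy selection: explore with probability epsilon."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])


# Q-table with all values initialised to zero (an assumption).
q_table = defaultdict(float)

# Assumed state: (north-south queue level, east-west queue level).
# Assumed reward: negative total queue observed after the action.
new_value = q_update(
    q_table,
    state=("long", "short"),
    action="serve_ns",
    reward=-4.0,
    next_state=("short", "short"),
    actions=ACTIONS,
)
```

The update needs no model of how traffic evolves, only the observed transition and reward, which is what "model-free" refers to. The delayed-feedback difficulty noted above appears here as the discounted `gamma * best_next` term, which propagates later rewards back to earlier decisions.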
Game theory studies the interaction between intelligent, rational decision makers who cooperate with the specific goal of benefiting from a mutually agreeable outcome. It has been widely used in economic, military, and communication applications [24,25], and to model traveler route choice behavior [26], control connected vehicle movements [27], and provide in-route guidance [28]. The literature indicates that investigation of game-theoretic traffic signal control is very limited. Bargaining theory is related to cooperative games through the concept of Nash bargaining (NB). A bargaining situation is defined as a situation in which multiple players with specific objectives cooperate and benefit by reaching a mutually agreeable outcome [29]. The bargaining process is the procedure that bargainers follow to reach an agreement (outcome) [30], and the bargaining outcome is the result of the bargaining process [31,32].
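The Nash bargaining concept can be illustrated with a small sketch. In the standard textbook formulation, the bargaining outcome is the feasible outcome that maximizes the product of each player's gain over the disagreement (threat) point. The code below applies that rule to a discrete set of candidate outcomes; the phase names, utility values, and threat points are hypothetical, and this is not the controller formulation developed in Section 2.

```python
from math import prod


def nash_bargaining(outcomes, threat):
    """Return (action, Nash product) for the best feasible outcome.

    outcomes: dict mapping an action name to a tuple of player utilities.
    threat:   tuple of disagreement utilities, one per player.
    An outcome is feasible only if no player falls below its threat value.
    Returns (None, None) when no outcome is feasible.
    """
    best_action, best_product = None, None
    for action, utilities in outcomes.items():
        gains = [u - d for u, d in zip(utilities, threat)]
        if any(g < 0 for g in gains):
            continue  # some player would do worse than disagreement
        product = prod(gains)
        if best_product is None or product > best_product:
            best_action, best_product = action, product
    return best_action, best_product


# Hypothetical utilities (e.g., queue reduction) for two players
# under three candidate signal phases.
candidate_outcomes = {
    "phase_A": (6.0, 1.0),
    "phase_B": (1.0, 5.0),
    "phase_C": (3.0, 3.0),
}

# With a zero threat point, the balanced outcome has the largest product.
balanced_choice = nash_bargaining(candidate_outcomes, threat=(0.0, 0.0))

# Raising player 1's threat value changes which outcome is selected.
shifted_choice = nash_bargaining(candidate_outcomes, threat=(2.0, 0.0))
```

In this example the selected outcome moves from `phase_C` to `phase_A` when the first player's threat value rises, which shows in miniature why the choice of threat point can influence the bargaining outcome.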
Traffic flow is affected by a number of factors, including weather, time-of-day, day-of-week, and unpredictable events, such as special events, incidents, and work zones. Consequently, traffic control strategies could be improved if control systems responded not only to actual conditions, but also adapted their actions to transient conditions. Due to the stochastic nature of traffic flows, an adaptive control strategy that adjusts to stochastic changes is needed. Cycle-free strategies may present an innovative and less restrictive means of accommodating variations in traffic conditions.
Traffic signal controllers can be categorized as centralized or decentralized. Centralized systems require a reliable and direct communication network between a central computer and the local controllers. The main advantage of these systems is that they allow for traffic signal coordination. However, decentralized systems offer many advantages over centralized control systems as they are computationally less demanding and require only relevant information from adjacent intersections/controllers. Robustness is also guaranteed in decentralized control systems, because if one or more controllers fail, the remaining controllers can take over some of their tasks. Decentralized systems are scalable and easy to expand by inserting new controllers into the system. Additionally, decentralized systems are often inexpensive to establish and operate, as there is no essential need for a reliable and direct communication network between a central computer and the local controllers in the field.
To mitigate traffic congestion, a novel de-centralized traffic signal controller based on an NB game-theoretic framework (DNB), which considers a flexible phasing sequence and cycle-free operation, is developed. The proposed controller was implemented and evaluated in the INTEGRATION microscopic traffic assignment and simulation software [33,34,35]. INTEGRATION is a microscopic model that replicates vehicle longitudinal motion using the Rakha–Pasumarthy–Adjerid collision-free car-following model, also known as the RPA model [36]. The RPA model captures vehicle steady-state car-following behavior using the Van Aerde model [37,38]. Movement from one steady state to another is constrained by a vehicle dynamics model described in [39,40]. Vehicle lateral motion is modeled using lane-changing models described in [35]. The model estimates of vehicle delay were validated in [41], while vehicle stop estimation procedures are described and validated in [42]. Vehicle fuel consumption and emissions are modeled using the VT-Micro model [43,44,45]. To evaluate its performance, the developed controller was compared to a decentralized phase split and cycle length controller (PSC) [6] and to a fully coordinated adaptive phase split, cycle length, and offset optimization controller (PSCO), where PSCO is based on the REALTRAN (REAL-time TRANsyt) controller that emulates the SCOOT system [46,47]. The DNB controller was implemented and evaluated on large-scale networks consisting of 38 (Blacksburg) and 457 (downtown Los Angeles) signalized intersections.
This paper describes the application and the testing of the proposed DNB controller on large-scale networks and is organized as follows.
Section 2 describes the developed de-centralized traffic signal controller using a game-theoretic framework.
Section 3 presents the experimental setup and results of a large-scale study in the town of Blacksburg, Virginia, consisting of 38 signalized intersections.
Section 4 describes the experimental setup and the experimental results of a large-scale study on a downtown network in Los Angeles, California, consisting of 457 signalized intersections.
Section 5 presents a summary and conclusions drawn from these studies.
5. Summary & Conclusions
The research presented in this paper develops and evaluates a Nash bargaining de-centralized flexible phasing cycle-free traffic signal controller (DNB controller) on large-scale networks. The controller was implemented and tested in the INTEGRATION microscopic traffic assignment and simulation software. The performance of the DNB controller was compared to a decentralized phase split and cycle length optimization controller based on the HCM procedures (PSC) and a fully-coordinated adaptive phase split, cycle length and offset optimization controller (PSCO), in the town of Blacksburg, Virginia and in downtown Los Angeles, California.
Several simulations were conducted on the Blacksburg network using different threat point values and phasing schemes to determine their effect on the controller’s performance. The results show significant reductions in the network-wide average travel time of and , a reduction in the average total delay of and , a reduction in the stopped delay of and , and a reduction in emission levels of and , over the PSC and PSCO controllers, respectively. In addition, the results show significant reductions on the intersection approach average travel time of , a reduction in the average queue length of , a reduction in the average number of vehicle stops of , a reduction in the fuel consumption of , a reduction in the emissions of , and a reduction in emissions of .
In addition, the DNB controller’s performance was tested in downtown Los Angeles, California, and compared to the performance of the de-centralized PSC controller. The results show significant improvements in various network-wide measures of performance. Specifically, a reduction in the average travel time of , a reduction in the average total delay of , a reduction in the stopped delay of , a reduction in the average number of vehicle stops of , and a reduction in emissions of , over the PSC controller. Moreover, the results show significant improvements in the signalized intersection operations with a reduction in the average travel time of , a reduction in the average queue length of , a reduction in the average number of vehicle stops of , a reduction in the fuel consumption and emissions of , and a reduction in emissions of . Furthermore, simulations conducted for lower traffic demand levels showed significant network-wide improvements with a reduction in the average total delay of , a reduction in the stopped delay of , and a reduction in the average number of stops of over the PSC controller. As these results indicate, the DNB controller can generate major performance improvements at lower demands. The results demonstrate significant potential benefits of using the proposed controller over other state-of-the-art centralized and de-centralized controllers on large scale networks.
In summary, a novel traffic signal controller is developed that offers a number of unique features. First, the controller adapts signal timings dynamically to changing traffic conditions without relying on historical data, which tend to be inaccurate and result in inefficient traffic signal plans. Second, the developed controller is de-centralized, which increases both the scalability and robustness of the system and avoids the problems inherent in complex centralized communication. Decentralized systems are often inexpensive to establish and operate, as there is no essential need for a reliable and direct communication network between a central computer and the local controllers in the field. Third, the controller, while de-centralized, does not sacrifice system-wide performance and computes the network-wide Nash optimum solution. Finally, the controller is designed to operate with current traffic signal controllers. This controller should increase the traffic handling capacity of roads and reduce unnecessary stop-and-go vehicular movement, which will reduce fuel consumption and, accordingly, air pollution.