Review

Artificial Intelligence-Based Adaptive Traffic Signal Control System: A Comprehensive Review

by
Anurag Agrahari
1,*,
Meera M. Dhabu
1,
Parag S. Deshpande
1,
Ashish Tiwari
1,
Mogal Aftab Baig
1 and
Ankush D. Sawarkar
2
1
Department of Computer Science and Engineering, Visvesvaraya National Institute of Technology (VNIT), Nagpur 440010, India
2
Department of Information Technology, Shri Guru Gobind Singhji Institute of Engineering and Technology (SGGSIET), Nanded 431606, India
*
Author to whom correspondence should be addressed.
Electronics 2024, 13(19), 3875; https://doi.org/10.3390/electronics13193875
Submission received: 16 August 2024 / Revised: 19 September 2024 / Accepted: 24 September 2024 / Published: 30 September 2024
(This article belongs to the Section Artificial Intelligence)

Abstract

:
The exponential increase in vehicles, rapid urbanization, and rising demand for transportation are straining the world’s road infrastructure today. To achieve a sustainable transportation system under dynamic traffic volumes, an Adaptive Traffic Signal Control (ATSC) system should be contemplated to reduce urban traffic congestion and, thus, help reduce the carbon footprint and greenhouse gas emissions. By adapting dynamically, the ATSC system can adjust signal timing settings in real time according to seasonal and short-term variations in traffic demand, enhancing the effectiveness of traffic operations on urban road networks. This paper provides a comprehensive study of the insights, technical features, and status of various research works on ATSC. In this paper, ATSC systems are categorized based on the number of road intersections (RIs), viz., single-intersection (SI) and multiple-intersection (MI), and on the techniques used to develop Traffic Signal Control (TSC) systems, viz., Fuzzy Logic (FL), Metaheuristic (MH), Dynamic Programming (DP), Reinforcement Learning (RL), Deep Reinforcement Learning (DRL), and hybrids. The findings from this review demonstrate that modern ATSC systems designed using these techniques offer substantial improvements in managing dynamic traffic flow density. There remains considerable scope for research in increasing the number of RIs considered when designing ATSC systems to suit real-life applications.

1. Introduction

Controlling and managing traffic signals at Road Intersections (RIs) to ensure vehicular traffic safety and a constant traffic flow is difficult for any dense urban area. An RI is where two or more roads converge, diverge, meet, or cross at the same grade. In urban areas, an RI acts as a hindrance or obstacle to the smooth flow of traffic, which is dynamic with time [1]. The excessive number of vehicles transitioning from one place to another and the absence of proper flow management interrupt the smooth traffic flow, resulting in severe traffic congestion and ineffective traffic control [2]. According to [3], fixed traffic signal (FTS) latencies account for about 10% of worldwide traffic delays. During the early days of urbanization, many government agencies realized the need for traffic control at RIs and deployed manual traffic signal systems. In a manual Traffic Signal Control (TSC) system [4], traffic officers modify the duration of the yellow, red, and green signals according to the traffic volume. This mode aims to achieve a smooth flow of traffic and safe passage for road commuters at RIs. Today, traffic light synchronization controls traffic at major RIs by providing the same amount of green signal time to all directions during one cycle [5,6], known as FTS [7,8]. Various papers [9,10] suggested that developed countries suffer over 295 million traffic hours of delay with FTS. Thus, detailed estimations of time-dependent delays are needed for better traffic control and management at RIs. Congestion at an RI results from the complex traffic system architecture’s inability to coordinate or correlate the timing of traffic lights with the traffic volume, which varies with the season and time of day [11]. The bulk of the working population residing in nearby areas causes the traffic flow in many urban cities [12] to be uneven and unbalanced, which is evident during morning and evening hours.
For smooth and safe traffic, urban areas must install a TSC system [13,14] that alters the duration of traffic lights and synchronizes them at RIs utilizing real-time traffic information. A balance between secure and efficient traffic management is necessary [15]. An Intelligent Transportation System (ITS) [14,16] offers a solution to the challenging problem of traffic control and management at RIs, known as the Adaptive Traffic Signal Control (ATSC) system, which automatically adjusts the durations of traffic lights for each direction based on current traffic circumstances to improve traffic flow [17]. This helps to reduce delays and ensure safety. The ATSC system maximizes traffic flow by continuously adjusting the green split durations in response to real-time traffic patterns and the expected arrivals from nearby intersections [18]. Systematically advancing vehicles through green signals significantly reduces travel times. Additionally, smoother flow eases traffic congestion. ATSC helps increase economic productivity [19] by reducing the time lost in traffic congestion, fuel wastage, and pollution, which remain pressing issues worldwide. One of the parameters to consider while designing an ATSC system is the number of RIs included in experimental modeling and evaluation: because of the high degree of interdependency between intersections, the complexity of the problem domain increases, specifically in managing the traffic flow.
The search terms and the total publication counts for the traffic-management-related keywords used in this study are shown in Figure 1.
The main contributions of this paper can be summarized as follows:
  • Investigated the ATSC system based on the number of RIs, viz., SI and MI.
  • Investigated various techniques used to design an ATSC system, viz., Fuzzy Logic (FL), Metaheuristic (MH), Dynamic Programming (DP), Reinforcement Learning (RL), Deep Reinforcement Learning (DRL), and Hybrid techniques.
  • Further, these techniques are investigated based on (a) a single-intersection environment (an isolated/islanded ATSC operating non-cooperatively) and (b) a multiple-intersection environment (ATSC operating in a cooperative environment).
  • Evaluated gaps in knowledge and unaddressed challenges for ATSC, with possible future research directions.
The organization of this paper is as follows: Section 2 presents the preliminaries related to traffic signal control; Section 3 presents the literature work performed in the domain of an ATSC system discussing system model distinctiveness and uncontemplated issues of the proposed solutions. Section 4 covers comparative studies, analysis, and experiments on various ATSC systems, and Section 5 discusses the conclusion and future research work.

2. Preliminaries

This section offers an overview of the foundational concept behind TSC systems, the terminology and metrics used for their evaluation, the different microsimulation tools, and their key features.

2.1. Traffic Signal Control System

To regulate traffic efficiently and ensure pedestrian safety at RIs, a well-designed TSC system is needed for the urban region; to accommodate dynamic traffic density, adaptive TSC systems are preferred. Traffic signal timing (TST) for each direction is vital in controlling RIs. By prioritizing transportation, one can also enhance the capacity of an RI with well-timed signals.

2.2. Traffic Signal Timing

At the controlled RI, the traffic lights change between red, yellow, and green. TST helps determine the time duration and sequencing of signals for each direction to ensure smooth traffic flows.
Time Parameters for TST: By allocating the right-of-way at an RI in the appropriate direction, TST settings [20,21] serve the primary objective of facilitating quick and safe passage through intersections. A few of the TST configuration parameters must be flexible enough to accommodate changes in dynamic traffic demand, while others ought to be under the control of the traffic management system. These command parameters are as follows:
  • Signal Phase: A signal phase is a period during which a specific movement, like vehicles or pedestrians, is given the right of way at an intersection, controlled by traffic lights. It is part of the traffic signal cycle that manages traffic flow [4].
  • Signal Cycle: One complete rotation in which each traffic movement at a specific road intersection receives a signal indication to facilitate safe passage.
  • Signal Sequence: The order in which signals are phased during a signal cycle.
  • Cycle Length: The number of seconds needed for a signal to go through one full signal cycle.
  • Green Time: The number of seconds during which a particular traffic lane at an RI may flow at its maximum rate.
  • Timing of Phases: The amount of time in seconds taken by a particular phase for a given direction during a specific single complete signal cycle [4].
  • Change Interval: The yellow interval provided after the green period of a movement.
  • Offset: The temporal relationship between the starts of coordinated phases at adjacent intersections. Extending the green phase duration for a specific movement can decrease the number of stopped vehicles and the latency. On the other hand, an increase in the green period of one traffic flow typically increases the delay and the number of vehicles queuing in competing traffic lanes. Therefore, a proper traffic signal strategy allots time to optimize overall traffic metrics, such as the average waiting time.
  • Average Waiting Time: The time a vehicle spends stopped at a road intersection before it can depart.
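As an illustration of how the timing parameters above fit together, the following minimal Python sketch computes a cycle length from per-phase green times and change intervals, and the resulting green split for each phase. The phase durations are purely illustrative, not taken from any study cited here.

```python
# Cycle length (s) = sum of all phase green times + one change (yellow)
# interval per phase; the green split is each phase's share of the cycle.

def cycle_length(green_times, change_interval):
    """Total signal cycle length in seconds for the given phase plan."""
    return sum(green_times) + change_interval * len(green_times)

greens = [30, 25, 30, 25]                     # green time (s) for four phases
c = cycle_length(greens, change_interval=4)   # 110 s of green + 16 s of yellow
splits = [g / c for g in greens]              # fraction of the cycle per phase
```

Lengthening one phase's green time raises the whole cycle length, which is why, as noted above, extending one movement's green typically increases the delay for competing movements.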

2.3. Microsimulation Tools

Analytical models help evaluate a system’s performance in general, but simulation tools are essential for scrutinizing various use cases for dynamic traffic analysis. Simulation is described as the replication of real-world applications to obtain knowledge more easily via models such as traffic flow models. These models help to explain the physical dispersion of traffic flow. For thoroughly examining the urban transportation system in a secure and appropriate setting, modeling traffic is indispensable [22]. Overall, traffic simulation tools can be broken down into two categories: microscopic and macroscopic. This review study focuses primarily on microsimulation-based simulators.

2.3.1. Microscopic

The microscopic simulation method considers the driver’s actions and interactions with other motorists and pedestrians. CORSIM, AIMSUN, Paramics, MATSim, VISSIM, and SUMO are some of the most popular microsimulation tools for investigating various dynamic challenges in urban traffic. This work considers a wide variety of research studies that use microsimulation tools in some way (either for evaluation or as a model component). As a result, the authors consider it appropriate to describe the parameters (vehicle direction, vehicle ID, route ID, vehicle speed, change of vehicle route, etc.) of a few frequently used microsimulation tools for the readers.
Nedal and Syed [23] analyzed the features and traits of several frequently used traffic simulation packages and offered a comparative study by focusing on a few unique features. AIMSUN, VISSIM, and CORSIM were deemed suitable for modeling arterial and state highway congestion and a coordinated network of major highways and streets. In contrast, AIMSUN, CORSIM, and PARAMICS were considered ideal for ITS [8]. MATSim, on the other hand, offers a framework for implementing large-scale agent-based transport simulations. We classify and evaluate the six frequently used microsimulation tools based on the following characteristics.
  • Open Source;
  • System
    -
    Discrete control system (DC)
    -
    Continuous control systems (CC);
  • Visualization (2D/3D);
  • Scope of Application
    -
    Regional (R)
    -
    City (C)
    -
    Country (Co);
  • Output file format;
  • Capability of importing maps;
  • Programming language supported;
  • Level of programming proficiency required.
SUMO, VISSIM, AIMSUN, MATSim, CORSIM, and Paramics are studied on the above features, and a summary is presented in Table 1. Researchers used different programming languages (PLs) to build simulation tools such as SUMO, VISSIM, etc. The architecture of each simulator is different, and the level of coding required changes with the tool’s framework: in some simulators, such as SUMO, the required coding proficiency is high, whereas in VISSIM, it is low.

2.3.2. Macroscopic

A mathematical traffic model called a macroscopic traffic flow framework [24] establishes relationships between various aspects of traffic flow, such as the dynamics of traffic density, the average speed of a traffic stream, etc. Typically, these models are built by merging microscopic traffic flow simulations and transforming an individual entity’s properties into those of a comparable aggregate system [25].

3. Review of Previous Research

Today, with a rising number of vehicles, there is an urgent need for an efficient TSC system for urban traffic management, allowing a safer and more efficient traffic flow at every RI [15,26]. The ATSC system has undergone numerous improvements since it was introduced to address the challenges faced by TSC [19]. These challenges include insufficient transportation infrastructure, rising vehicle numbers, atmospheric conditions, growing infrastructure, public events, modernization of traffic network layouts, etc. Each of these factors can generate traffic congestion at any time. Solving the challenging, complicated, and nonlinear stochastic problem [27] of alleviating the traffic congestion induced by these factors is the responsibility of both researchers and engineers [28,29].
In the following sections, we examine the studies conducted on ATSC systems, which can be broadly categorized into two, viz., single-intersection ATSC (SI-ATSC) and multiple-intersection ATSC (MI-ATSC), as shown in Figure 2a and Figure 2b, respectively. Figure 3 illustrates the comprehensive technique employed in TSC systems.
Each following subsection contains a comprehensive study that summarizes the research work performed using a particular technique based on the following parameters:
  • Number of road intersections considered (SI/MI);
  • Objective;
  • Methods/parameters used;
  • Control system strategy;
  • Data source;
  • Microsimulation tool used.

3.1. Single-Intersection ATSC (SI-ATSC)

Much research has been performed on the ATSC system, and SI-ATSC was considered while designing the experimental models. Various techniques [30,31] were used to design SI-ATSC systems. The following subsections present the categorization and assessment of the research work, which used microsimulation tools in their studies for designing the ATSC system. The limitation of SI-ATSC is the long queue formation and the increased average waiting time for vehicles at the adjacent RI.

3.1.1. SI-ATSC Using Reinforcement Learning: (SI-ATSC-RL)

Kaige Wen et al. [32] implemented a stochastic ATSC system using RL, which can provide an appropriate control policy to keep the traffic network from getting overcrowded. The standard intersection traffic model is expanded to a new mode that considers various real-world elements of traffic situations, such as the turning fraction and lane design. Lu Shoufeng et al. [33] evaluated the performance of Q-Learning for ATSC. Action space and phase green time change are explored in this study. This can be mathematically shown, as in Equation (1).
Q_t(s_t, a_t) = Q_t(s_t, a_t) + α [ r_{t+1} + γ · max_B Q_t(s_{t+1}, a_t) − Q_t(s_t, a_t) ]
where Q_t(s_t, a_t) is the value of action a_t taken in state s_t, Q_t(s_{t+1}, a_t) is the immediate future Q-value, s_{t+1} is the next state, max_B selects the maximum over all possible actions a_t in state s_{t+1}, and r_{t+1} is the reward received after taking action a_t in state s_t. This strategy reduces the average vehicle delay by 11 min/h, or 18.3%. The limitation is the length of the action space, which requires additional study.
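The tabular update in Equation (1) can be sketched in a few lines of Python. This is not the implementation of [33]; the state labels, action names, and reward choice below are hypothetical, chosen only to show the update rule.

```python
# One tabular Q-learning step for a signal controller: the action adjusts
# the current phase's green time; the reward is the (negative) change in
# total queue length, a common choice in ATSC studies.
from collections import defaultdict

def q_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
    return Q[(s, a)]

Q = defaultdict(float)                       # Q-table, zero-initialized
actions = ["extend_green", "switch_phase"]
# queues grew by 2 vehicles during the step, so the reward is -2.0
q_update(Q, s="long_queue", a="extend_green", r=-2.0,
         s_next="short_queue", actions=actions)
```

Repeating this update over many simulated cycles drives Q toward the action values of the control policy, which is where the action-space size noted above becomes the bottleneck.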
El-Tantawy et al. [34] used a stochastic closed-loop optimal control problem to represent an ATSC system, which was successfully solved using RL. In this simulation environment, a generic RL control system was created and simulated for a multi-phase TSC at an isolated RI in downtown Toronto, reducing the queue length by 28%, average latency by 27%, and CO2 emission factors by 28%. Investigations are ongoing to determine the best sensor module to improve the performance further. Another limitation of this approach is that it does not consider transit vehicles. Sahand et al. [35] implemented a Deep Q Network (DQN) and Inverse Reinforcement Learning (IRL) to extract the rewards. One of the challenges of this approach is the vast state space. The results of [35] in simulation-based autonomous driving scenarios reflect the perceived relationship between the reward function and the data generated by distance sensors mounted at different positions on the vehicle. They also demonstrate that their simulated agent makes accident-avoidance motions and exhibits human-like lane change behavior after a few learning rounds. The generality of the IRL approach is its limitation; more advanced techniques, such as Maximum Entropy IRL and support for nonlinear reward functions, are currently being investigated. Juntao Gao et al. [36] used DRL to develop an ideal strategy for ATSC by automatically extracting all useful information from raw real-time traffic data. They used target networks and experience replay to increase algorithm stability.
Touhbi et al. [37] examined RL’s viability, focusing on applying Q-Learning to adaptable environments. An RL-based control system was created under various traffic dynamics for an isolated multi-phase intersection. The techniques used in [37] are novel, as the authors used a newly created generalized state space with separate reward definitions. The main finding of this strategy is that a reward function’s effectiveness depends on the traffic density at the RI. The technique proposed in [37] needs further evaluation of the impact of various parameters, such as the number of phases, the lanes at each artery, etc., and the RI architecture. Genders et al. [38] used neural network function approximation to model three state representations with a range of resolutions to assess the effectiveness of the A3C algorithm. The limitation of this approach is the need for high-resolution state data and the development of rich representations. Liu et al. [39] investigated the capability of DQN to optimize TSC policies and demonstrated that DQN algorithms produce a “threshold” policy in an SI scenario. This can be shown mathematically as in Equation (2).
y_p^DQN = r_p + γ · max_{a_{p+1}} Q(s_{p+1}, a_{p+1}, Θ_p)
where Q ( s p + 1 , a p + 1 , Θ p ) refers to the target network.
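A minimal sketch of how the target in Equation (2) is computed for one transition follows; a plain list of per-action Q-values stands in for the frozen target network, and all numbers are illustrative.

```python
# Bootstrapped DQN target: y_p = r_p + gamma * max over a_{p+1} of
# Q(s_{p+1}, a_{p+1}, Theta_p), where the Q-values come from a target
# network whose parameters Theta_p are held fixed between updates.

def dqn_target(reward, gamma, q_target_values, done=False):
    """Target value for one transition; terminal states have no bootstrap term."""
    if done:
        return reward
    return reward + gamma * max(q_target_values)

# target-network Q-values for the next state, one entry per signal action
q_next = [1.0, 3.0, 2.0]
y = dqn_target(reward=0.5, gamma=0.9, q_target_values=q_next)
```

The online network is then trained to regress toward y; freezing Θ_p in the target, together with experience replay, is what stabilizes learning in the approaches cited above.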
The main benefit of this method is the emergence of intelligent behavior, like “green wave” formations, which reveals its capacity to acquire desirable structural elements. This technique’s main drawback is that further work is needed to examine locality properties and how to use them in creating distributed coordination schemes for large-scale deployment scenarios. Garg et al. [40] examined the issue of traffic congestion at RIs. They designed a traffic simulator to model numerous traffic conditions as closely linked to real-world traffic circumstances as possible and suggested a policy-gradient-based DRL method to build traffic light management policies. The main benefit of this technique is that it can handle unexpected changes in traffic flow, density, and circumstances, as well as accidents that cause bottlenecks. The technique has to be further examined for use in multi-lane, complicated RIs. Chin et al. [41] proposed a Q-Learning method to handle the traffic light timing schedule more effectively. Q-Learning draws on rewards from previous experience, including anticipated future actions, to learn and decide the best possible choices. Liang et al. [42] studied how to choose the length of a traffic light phase based on information gathered from various sensors and proposed a DRL model [43] for TSC. To boost performance, the suggested model integrates numerous optimization aspects, including a dueling network, a target network, a Double DQN, and prioritized experience replay. The summary of SI-ATSC using RL is presented in Table 2.

3.1.2. SI-ATSC Using Metaheuristic: (SI-ATSC-MH)

Metaheuristic strategies seek the best ranges for numerous signal timings that affect the performance of traffic signals, such as cycle time, green duration, phase sequence, offsets, change period, and so on.
Zaharia et al. [46] addressed the issues of traffic flow control in urban areas by generating a traffic signal optimization (TSO) strategy as a solution. The suggested method employed the Particle Swarm Optimization (PSO) technique for traffic light cycle programming. This can be shown mathematically, as in Equations (3) and (4), respectively.
x_{h+1}^i = x_h^i + v_{h+1}^i
v_{h+1}^i = w · v_h^i + U[0, Φ_1] · (pbest_h^i − x_h^i) + U[0, Φ_2] · (hbest_h − x_h^i)
where v_{h+1}^i is the particle velocity, pbest_h^i is the particle’s own best solution, hbest_h is the global best particle in the entire swarm, w is the inertia weight of the particle, Φ_1 and Φ_2 are the acceleration coefficients, and U[0, Φ_n] is a uniform random value.
The drawback of this strategy is that it cannot be extended to RIs with more traffic lights than those considered, as the study evaluated a fixed number of lights at an RI.
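For illustration, the particle update of Equations (3) and (4) can be sketched as follows. The position vector is taken to encode green-time splits, and all parameter values are hypothetical rather than those used in [46].

```python
# One PSO step: velocity mixes inertia, attraction to the particle's own
# best position (pbest), and attraction to the swarm's global best (hbest).
import random

def pso_step(x, v, pbest, hbest, w=0.7, phi1=1.5, phi2=1.5):
    """v' = w*v + U[0,phi1]*(pbest - x) + U[0,phi2]*(hbest - x); x' = x + v'."""
    v_new = [w * vi
             + random.uniform(0, phi1) * (pb - xi)
             + random.uniform(0, phi2) * (hb - xi)
             for xi, vi, pb, hb in zip(x, v, pbest, hbest)]
    x_new = [xi + vi for xi, vi in zip(x, v_new)]
    return x_new, v_new

random.seed(0)                           # reproducible run
x, v = [30.0, 25.0], [0.0, 0.0]          # green times (s) for two phases
x2, v2 = pso_step(x, v, pbest=[32.0, 24.0], hbest=[35.0, 22.0])
```

Each particle is then scored by a fitness function (e.g., simulated average delay for its signal plan), and pbest/hbest are updated before the next step.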
To tackle the optimal traffic signal configuration problem, a bi-level optimization framework was proposed by Li Z. [47] in 2017. The upper-level agent generates traffic signal configurations to reduce drivers’ average trip time, whereas the lower-level agent seeks to achieve network stability using the upper-level settings. The genetic algorithm (GA) is combined with dynamic traffic assignment to decompose the complicated bi-level challenge into tractable, sequentially solvable, single-level problems. This approach’s limitation is that it does not consider acyclic signal patterns, which are better suited to situations where traffic conditions change rapidly and a specific direction has very sparse traffic over a brief period.
Yu et al. [48] proposed an optimal TSC methodology for determining the signal control parameters that minimize total trip time; it used two signal control methodologies: fixed-timing scheduling and ATSC. A heuristic GA is proposed to solve the resulting non-linear programming task with time-varying delay terms. With interrupted traffic flow, it is challenging for the proposed method to define the adaptive user equilibrium accurately.
Jia et al. [49] proposed a multi-objective TSO framework with objectives such as per-person delay, car emissions, and road intersection capacity. Considering the target problem’s characteristics, an MH algorithm is designed that connects difference operators based on the Particle Swarm Optimization (PSO) technique. This strategy can significantly improve traffic conditions at urban RIs. The disadvantage of this strategy is that mixed traffic scenarios, such as pedestrians, transit vehicles, etc., are not considered. The summary of the presented work is shown in Table 3.

3.1.3. SI-ATSC Using Fuzzy Logic: (SI-ATSC-FL)

Many researchers developed ATSC systems using FL techniques to manage signal timings, as FL controllers can efficiently handle linguistic and imprecise traffic data. Dexin et al. [52] proposed a method utilizing Fuzzy Logic Programming (FLP) to optimize signal timing at an isolated RI. Considering the intersection’s overall operational efficiency, vehicle cycle delay, traffic capacity, cycle stops, and exhaust pollution are initially selected as optimization targets to construct a multi-objective function. Then, for different traffic flow ratio states, an FLP technique [53,54] is used to assign different weight factors to the various optimization objectives. The disadvantage of this strategy is that it does not consider a wider range of situations in terms of different categories of drivers, vehicles, and road intersections.
Aksa et al. [55] proposed a real-time traffic simulator with an adaptive fuzzy inference technique for scheduling the anticipated light signal time. It modifies the signal duration based on the number of vehicles waiting behind the RI’s green and red signals. Given a scenario, it automatically generates traffic flows based on the parameters supplied. Following that, the obtained results were analyzed in the simulated environment. The proposed technique needs further evaluation of the impact of various parameters, such as types of drivers, vehicles, and RIs.
Vogel et al. [56] provided an FL-ATSC for enhancing traffic flow at an isolated RI. Using the collected data from road detectors (arrival flow, queue length, and departure flow), a set of fuzzifications was developed to determine whether the next phase should be reduced or extended. Further investigation regarding the shortcomings of this strategy and potential applications of the GA for the fuzzy rule set optimization needs to be performed.
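To make the fuzzification idea concrete, the following minimal sketch implements a single rule of the form “IF the queue is long AND arrivals are high THEN extend the green phase” with triangular membership functions. The rule and all breakpoints are hypothetical, not the rule base of [56].

```python
# One-rule fuzzy controller for green extension: fuzzify two detector
# readings, combine antecedents with min (fuzzy AND), and defuzzify by
# scaling the maximum extension with the rule's firing strength.

def tri(x, a, b, c):
    """Triangular membership function rising from a, peaking at b, falling to c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def green_extension(queue_len, arrival_rate, max_ext=10.0):
    """Seconds of green extension implied by the single rule."""
    long_queue = tri(queue_len, 5, 15, 25)          # vehicles in queue
    high_flow = tri(arrival_rate, 0.2, 0.6, 1.0)    # arrivals (veh/s)
    return min(long_queue, high_flow) * max_ext     # fuzzy AND, then scale

ext = green_extension(queue_len=15, arrival_rate=0.6)
```

A full controller would aggregate many such rules over all phases; GA tuning of the breakpoints, as suggested for [56], amounts to optimizing the a, b, c parameters of each membership function.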

3.2. Multiple Intersection ATSC (MI-ATSC)

In the MI-ATSC system, as mentioned before, the number of intersections included while modeling the ATSC system affects the complexity of optimizing traffic flow control due to the interdependency between them: to reduce travel time, synchronized signal timings are required across the RIs. The optimization solution for MI-ATSC must be synchronized carefully so that coordinated intersections share a standard cycle length, that is, the time (in seconds) it takes for a traffic signal to serve all indications at the RI. With ATSC systems, cycle lengths frequently change throughout the day for peak and off-peak traffic conditions [57]. Critical to synchronization is that the clocks at all coordinated intersections agree on the time. The advantages of this approach are fewer queued vehicles at an RI and a lower average waiting time for each vehicle, reducing greenhouse gas emissions and helping the environment.
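The coordination described above can be illustrated with a simple green-wave offset calculation, in which each downstream signal’s green start is delayed by the travel time from the previous intersection, modulo the shared cycle length. The distances, speed, and cycle length below are illustrative only.

```python
# Offsets for a coordinated corridor: a platoon leaving the first signal
# at the start of green should arrive at each downstream signal just as
# its green begins, so offsets are cumulative travel times mod the cycle.

def green_wave_offsets(distances_m, speed_mps, cycle_s):
    """Offset (s) of each signal relative to the first, for one direction."""
    offsets, t = [0.0], 0.0
    for d in distances_m:
        t += d / speed_mps          # travel time to the next intersection
        offsets.append(t % cycle_s) # wrap into the shared cycle
    return offsets

# three 400 m links traversed at 10 m/s under a shared 90 s cycle
offs = green_wave_offsets([400, 400, 400], speed_mps=10.0, cycle_s=90)
```

Real MI-ATSC systems must solve this for both directions and for time-varying cycle lengths, which is precisely the interdependency that makes the multiple-intersection problem hard.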

3.2.1. MI-ATSC Using Reinforcement Learning: (MI-ATSC-RL)

In [58], the authors proposed a dynamic phasing sequence in acyclic signal control with Q-Learning. The authors created three models with different state depictions to study the optimum state model for various traffic scenarios. The models were evaluated on a common multiphase intersection to reduce vehicle delays and benchmarked against the pre-timed control method. One of the limitations of this approach is the time complexity of calculating the attention mechanism; reducing it would allow the agent to react to its neighborhood’s role more quickly and lower the response time. Additional external data, such as weather information, daytime illumination, temperature, and humidity, could further enhance performance.
LA et al. [59] proposed an RL method with function approximation. The suggested approach integrates state-action characteristics and is simple to apply in high-dimensional environments. The advantage of their approach is that, unlike earlier RL-based work, it does not require specific data on queue sizes and time intervals at each lane but instead works with the previously given characteristics.
El-Tantawy et al. [60] presented a MARL technique for an integrated network of ATSC (MARLIN-ATSC). It provides two different operating modes: (i) an independent mode, in which each RI controller acts independently of the agents at nearby intersections, and (ii) an integrated mode, in which each RI controller coordinates signal control operations with other road intersections. During the morning rush hour, MARLIN-ATSC was evaluated on an extensive network simulation of fifty-nine signalized RIs in Toronto’s lower urban area, Canada. The results demonstrated an extraordinary reduction in the average intersection wait time, by 27% in the independent mode and 39% in the integrated mode, as well as travel time savings of 15% (independent) and 26% (integrated) along the busiest routes in metropolitan Toronto. One of the limitations of this technique is that the application of other RL algorithms with function approximation remains to be explored. A second limitation is that the effects of driver behavior [61] are not incorporated into these approaches.
Abdoos et al. [62] proposed a Q-Learning-based two-level hierarchical ATSC. The TSC at a particular RI can be viewed as a first-level autonomous agent (on the lowest rung of the hierarchy) that learns a control policy using Q-Learning. At the second level, the network is split into regions, with an agent allocated to each area (at the top of the hierarchy). A grid of three-by-three junctions with nine RIs was utilized. The technique’s remaining work includes utilizing various tiling configurations and applying the method to more extensive real networks.
Aslani et al. [63] developed an actor–critic ATSC (A-CATs) system that addressed the following scenarios: (a) performance comparisons of continuous and discrete A-CATs controllers in a continuously congested traffic network (24 h traffic demand) in Tehran’s upper downtown; (b) how various traffic interruptions, such as opportunistic pedestrian crossings, parking lanes, infrequent traffic jams, and varying degrees of sensor noise, affect the effectiveness of A-CATs; and (c) the effectiveness of various function approximators (radial basis functions and tile coding) on the A-CATs controllers’ learning. A remaining challenge for the technique is creating a weather-sensitive TSC system, since driver behavior is affected by weather conditions, which alter driving speeds, headways, and reaction times.
Aziz et al. [64] used an RL algorithm based on the R-Markov Average Reward Technique, specifically RMART, to solve a vehicular signal management problem by utilizing information-sharing across signal controllers in a connected vehicle [65] context. The complexity of the method needs to be thoroughly tested using various parameters such as types of drivers and types of vehicles.
Li et al. [66] introduced a Multi-Agent Reinforcement Learning (MARL) method named KS-DDPG for achieving optimal control by improving traffic signal cooperation. By implementing a communication protocol that facilitates knowledge sharing, each agent has access to a collective representation of the traffic environment compiled by all agents. Two experiments utilizing synthetic and real-world datasets were conducted to test the suggested technique. The drawback of this approach is that every agent must communicate throughout the modeling process, which limits the effectiveness of communication as a whole.
Lin et al. [67], Salah et al. [68], and R. et al. [69] developed DRL-based systems that integrate multiple techniques to master an appropriate control strategy in a short time. The suggested algorithm relaxes the assumption of a fixed traffic demand design and eliminates the need for human intervention in parameter adjustment. This method’s key advantage is its ability to handle more complicated MI control problems while consuming less computational power. The challenge of this approach is converting the state into a suitable format for a more general, unstructured traffic network.
Abdoos et al. [70] suggested a hierarchical multi-agent system with two tiers to control traffic lights. Each traffic light is overseen by a first-level agent. For the second level, the traffic network is divided into regions, each controlled by a region controller agent. First-level agents use RL to determine the optimal strategy while sending their local data to the higher-level agents. A limitation of this technique is that it has not been extended beyond two levels.
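The two-tier interaction can be sketched as intersection agents reporting local summaries upward while the region agent returns a coordination signal that biases local decisions. This is a minimal structural sketch, not the learning scheme of [70]; the queue threshold and bias value are hypothetical.

```python
# Hypothetical two-tier hierarchy: intersection agents decide locally,
# a region agent aggregates their reports and nudges them toward the
# regional objective.
class IntersectionAgent:
    def __init__(self):
        self.q = {0: 0.0, 1: 0.0}   # local values of "keep" vs "switch"

    def report(self, queue):
        return queue                 # local data sent to the region level

    def act(self, region_bias):
        # region_bias > 0 favours switching (e.g. to flush a congested region)
        return max(self.q, key=lambda a: self.q[a] + (region_bias if a == 1 else 0.0))

class RegionAgent:
    def coordinate(self, reports):
        avg_queue = sum(reports) / len(reports)
        return 1.0 if avg_queue > 10 else 0.0   # toy congestion threshold

agents = [IntersectionAgent() for _ in range(3)]
queues = [14, 9, 12]
bias = RegionAgent().coordinate([a.report(q) for a, q in zip(agents, queues)])
actions = [a.act(bias) for a in agents]   # all switch: region is congested
```

Extending this pattern to three or more tiers, the open problem noted above, would mean region agents themselves reporting to a network-level agent.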
Zhang et al. [71] suggested a DRL-based ATSC technique built on the Dueling Double Deep Q Network (D3QN) framework. The proposed method seeks to balance the goals of safety, efficiency, and decarbonization while optimizing traffic signals at RIs. Evaluated on a simulated RI in Changsha, China, the ATSC algorithm reduces traffic conflicts by over 16% and carbon emissions by 4%, outperforming standard and efficiency-optimized ATSC techniques, and performs especially well in high-traffic-demand circumstances, satisfying all three objectives. The proposed strategy is, however, limited to a single RI during peak hours and may not generalize to other RIs.
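D3QN combines two ingredients: a dueling head that decomposes Q-values into a state value plus mean-centered action advantages, and a double-DQN target that selects the next action with the online network but evaluates it with the target network. The sketch below shows both mechanics with random placeholder weights; the dimensions and the single transition are hypothetical, not the trained model of [71].

```python
import numpy as np

rng = np.random.default_rng(7)

STATE_DIM, N_ACTIONS, gamma = 6, 4, 0.99

def make_theta():
    return {"W": rng.normal(size=(8, STATE_DIM)),
            "v": rng.normal(size=8),
            "A": rng.normal(size=(N_ACTIONS, 8))}

def dueling_q(theta, s):
    h = np.tanh(theta["W"] @ s)
    v = theta["v"] @ h                  # scalar state value V(s)
    adv = theta["A"] @ h                # advantages A(s, a)
    return v + adv - adv.mean()         # Q(s, a) = V + (A - mean A)

online, target = make_theta(), make_theta()
s, s_next, r = rng.normal(size=STATE_DIM), rng.normal(size=STATE_DIM), -2.0

# Double-DQN target: select with the online net, evaluate with the target net.
a_star = int(np.argmax(dueling_q(online, s_next)))
y = r + gamma * dueling_q(target, s_next)[a_star]
td_error = y - dueling_q(online, s).max()
```

In a multi-objective setting such as [71], the scalar reward `r` would itself be a weighted combination of safety, efficiency, and emission terms.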
Wang et al. [72] proposed a Multi-layer Graph Mask Q-Learning (MLGMQL) framework for multi-intersection (MI) Adaptive Traffic Signal Control (ATSC) aimed at optimizing traffic and reducing delays. The framework utilizes an updated GraphSAGE algorithm and graph attention mechanisms to model traffic into two layers: the upper-level network-layer graphs and lower-level intersection-layer graphs. This structure allows the system to adapt to traffic conditions and road networks. However, one major limitation arises in intersections with more than four directions, as the model simplifies them into four-way intersections, potentially leading to issues like lane misalignment and vehicle confusion.
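The intersection-layer graphs in this framework rest on GraphSAGE-style neighbor aggregation: each intersection's embedding is computed from its own features concatenated with an aggregate of its neighbors' features. The following is a minimal single-layer sketch with a mean aggregator and random placeholder weights; the five-node graph and dimensions are hypothetical, not the updated algorithm of [72].

```python
import numpy as np

rng = np.random.default_rng(11)

# Hypothetical road network: per-intersection state vectors and adjacency.
features = {i: rng.normal(size=4) for i in range(5)}
neighbors = {0: [1, 2], 1: [0, 3], 2: [0, 4], 3: [1], 4: [2]}
W = rng.normal(size=(4, 8))   # shared layer weights: (out_dim, 2 * in_dim)

def sage_layer(feats):
    """One GraphSAGE-style layer: concat(self, mean of neighbours) -> ReLU."""
    out = {}
    for node, nbrs in neighbors.items():
        agg = np.mean([feats[n] for n in nbrs], axis=0)     # mean aggregator
        out[node] = np.maximum(W @ np.concatenate([feats[node], agg]), 0.0)
    return out

embeddings = sage_layer(features)
```

Stacking such layers, and aggregating the resulting intersection embeddings again at the network layer, gives the two-level graph structure described above.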
Zhou et al. [73] introduced a Multi-Agent Incentive Communication Deep Reinforcement Learning (MICDRL) approach for coordinating multi-intersection traffic control. Agents in the system generate personalized messages that influence neighboring agents’ policies, improving overall coordination and leading to globally optimized traffic control decisions. A notable feature of this method is its reliance on local information for message generation, which reduces communication overhead while maintaining effective teamwork. However, a limitation of the approach is the lack of consideration for pedestrian and vehicle classifications, which need to be integrated for more comprehensive control.
Wu et al. [74] developed a multi-agent framework for large-scale traffic control using game-theory-supported Reinforcement Learning strategies like Nash-A2C and Nash-A3C. These algorithms are implemented within a distributed IoT architecture, particularly in fog layers, to enhance scalability. Their evaluations demonstrated improvements over traditional traffic signal control systems, achieving a 22.1% reduction in network latency and a 9.7% decrease in congestion time. Despite these advances, further research is required to optimize the system with a lightweight Multi-Agent Reinforcement Learning (MARL) model for more efficient large-scale deployment. The summary of MI-ATSC using RL is presented in Table 4.

3.2.2. MI-ATSC Using Metaheuristics (MI-ATSC-MH)

Hajbabaie et al. [82] developed a program for simultaneous network signal timing optimization and traffic assignment within urban transportation networks. The method integrates a genetic algorithm (GA) to enhance signal timing settings and simultaneously solve system-optimal traffic assignments, considering oversaturated traffic conditions and diverse driver behaviors. The model also incorporates meta-heuristic optimization [82] for improved performance. A significant limitation of this method is the assumption that drivers cannot dynamically change their routes without access to en-route information, which restricts its adaptability in real-time scenarios.
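The GA component of such approaches searches over candidate signal-timing settings, scoring each by a traffic-performance objective. The toy version below evolves the green split of a two-phase intersection; the quadratic delay proxy is a deliberately crude stand-in for the simulation-based evaluation in [82], and all parameters are hypothetical.

```python
import random

random.seed(3)

CYCLE, MIN_G = 60, 10      # cycle length and minimum green (seconds)
demand = (0.7, 0.3)        # share of arrivals served by phase 1 / phase 2

def delay(g1):
    """Crude proxy: demand-weighted squared red time per phase."""
    g2 = CYCLE - g1
    return demand[0] * (CYCLE - g1) ** 2 + demand[1] * (CYCLE - g2) ** 2

def evolve(pop_size=20, gens=40):
    # individuals are phase-1 green times within [MIN_G, CYCLE - MIN_G]
    pop = [random.randint(MIN_G, CYCLE - MIN_G) for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=delay)                        # elitist selection
        parents = pop[: pop_size // 2]
        children = []
        for _ in range(pop_size - len(parents)):
            a, b = random.sample(parents, 2)
            child = (a + b) // 2                   # crossover: average split
            if random.random() < 0.2:              # mutation: small jitter
                child += random.randint(-5, 5)
            children.append(min(max(child, MIN_G), CYCLE - MIN_G))
        pop = parents + children
    return min(pop, key=delay)

best_g1 = evolve()   # should settle near the heavier-demand phase's optimum
```

In [82], the chromosome additionally encodes network-wide timings, and the fitness evaluation embeds the system-optimal traffic assignment rather than a closed-form proxy.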
Dakic et al. [83] proposed two Traffic Signal Control (TSC) algorithms, initial backpressure and modified backpressure, to improve network performance. The goal was to maximize urban traffic throughput by managing signal operations efficiently. However, the system lacks adaptability to dynamic changes in traffic, especially in highly fluctuating urban environments. Elgarej et al. [50] introduced a Distributed Ant Colony Optimization (ACO) approach for optimizing Traffic Signal Timing (TST). The primary goal was to manage real-time intersection control using a decentralized ACO algorithm, reducing congestion by adjusting signal durations based on real-time input data. While effective, the approach could benefit from further improvements in handling high-density traffic with more complex patterns. Nguyen et al. [84] proposed a multi-objective evolutionary algorithm (MOEA) called NSGA-II-LS, integrating local search techniques to optimize traffic control behaviors. Its effectiveness was tested in oversaturated traffic conditions, demonstrating improvements over traditional methods. However, more work is needed to incorporate additional traffic features into the optimization process for better real-time adaptability.
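The core of any backpressure controller is simple: each phase serves a set of movements, a movement's weight is its upstream queue minus the queue on its downstream link, and the phase with the largest total weight is actuated. A minimal sketch of that selection rule follows; the intersection layout and queue values are hypothetical, and the modified backpressure of [83] adds refinements beyond this basic rule.

```python
def backpressure_phase(phases, queue):
    """phases: {phase: [(upstream, downstream), ...]}, queue: {link: vehicles}.
    Returns the phase whose served movements have the greatest total pressure."""
    def pressure(movements):
        return sum(queue[u] - queue[d] for u, d in movements)
    return max(phases, key=lambda p: pressure(phases[p]))

queues = {"N_in": 12, "S_in": 9, "E_in": 4, "W_in": 3,
          "N_out": 2, "S_out": 1, "E_out": 6, "W_out": 5}
phases = {"NS": [("N_in", "S_out"), ("S_in", "N_out")],
          "EW": [("E_in", "W_out"), ("W_in", "E_out")]}

best = backpressure_phase(phases, queues)   # NS pressure 18 vs. EW pressure -4
```

Because the rule uses only queues on adjacent links, it is fully decentralized, which is what makes backpressure attractive for maximizing network throughput.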
Wardrop’s User Equilibrium (UE) theory [85] outlines a scenario where drivers select the fastest routes based on known traffic conditions, ultimately leading to network balance. This theory assumes that congestion will naturally balance out when traffic conditions are widely known; however, in real-world situations, the unpredictable nature of traffic flow can still lead to inefficiencies.
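Wardrop's first principle can be stated compactly: at user equilibrium, every used path of an origin–destination pair has the common minimal travel time, and no unused path is faster.

```latex
% User Equilibrium (Wardrop's first principle), for each OD pair w
% with path set P_w:
f_p > 0 \;\Rightarrow\; t_p(\mathbf{f}) = \mu_w,
\qquad
f_p = 0 \;\Rightarrow\; t_p(\mathbf{f}) \ge \mu_w,
\qquad \forall p \in P_w,
```

where $f_p$ is the flow on path $p$, $t_p(\mathbf{f})$ its flow-dependent travel time, and $\mu_w$ the minimal travel time for pair $w$.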
Guo et al. [86] developed a genetic algorithm-based optimization model aimed at area-wide signal timing in user-equilibrium traffic scenarios. The model minimizes journey time by optimizing the relationship between trip time and distance, offering a robust solution for managing urban road intersections under heavy traffic. However, like other models, its real-time performance could be enhanced by integrating dynamic traffic behavior and external factors. The summary of the presented work is shown in Table 5.

3.2.3. Hybrid Approach (MI-ATSC-Hybrid)

Gao et al. [88] studied the urban traffic light scheduling problem (UTLSP) using a centralized model within a scheduling framework. The suggested model incorporates the concepts of splits, cycles, and offsets, which places UTLSP among model-based optimization problems. The network controller assigns timings to each traffic signal in real time, with the goal of minimizing the total network delay over a given finite time window. To address the UTLSP, a swarm intelligence approach called discrete harmony search (DHS) is developed, and, based on the characteristics of UTLSP, three local search operators with different components are proposed to increase the efficiency of DHS in the local search space. In [89,90], the authors suggested combining several local search techniques to incorporate various neighborhood structures. In [83,91,92,93], mathematical methods are combined with a microsimulation tool; the techniques used were Backpressure (BP), Dynamic Programming (DP), and optimal control policies. In [60,63], two signal control techniques relying on the BP model were proposed to improve throughput in urban areas, with the backpressure formulation modified after these models were initialized. According to the results, the suggested algorithms surpassed fixed-time scheduling and actuated control methods. Hatri et al. [87] suggested an intelligent TSC system based on multi-objective PSO (MOPSO). They used average wait time and traffic flow on the congested road as their two objectives, and gave each swarm agent the ability to select optimal MOPSO parameters using a multi-objective Q-Learning strategy. Gao et al. [94] examined a TSS problem in a transportation network comprising signalized and unsignalized RIs. The objective was to minimize the sum of all vehicle delays across the network within a specific time interval. First, a novel model describing a traffic network with signalized and unsignalized RIs was proposed.
Next, five meta-heuristic models (GA, ABC, HS, the Jaya algorithm, and WCA) were applied to the TSS problem. Jiang et al. [95] and Storani et al. [96] researched traffic light schedule optimization and flow prediction, respectively. Combining these two methodologies, the authors suggested an urban TSC system based on traffic flow forecasting. First, an urban TSC system design incorporating signal control optimization and traffic flow forecasting is provided. Second, a flexible traffic light scheduling method is designed to reduce congestion, with the objective of minimizing the number of vehicles blocked across all signalized RIs. The presented works are summarized in Table 6. The statistics regarding the number of studies conducted in the field using ATSC and various techniques are shown in Figure 4.
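Harmony search, the basis of the DHS approach in [88], keeps a memory of good solutions and builds each new candidate component-wise, either recalling a value from memory (with occasional pitch adjustment) or sampling a fresh one. The discrete toy below searches over a three-signal green-time vector; the deviation-from-ideal objective is a stand-in for the UTLSP delay model, and all parameters are hypothetical.

```python
import random

random.seed(5)

N_SIGNALS, HMS, HMCR, PAR = 3, 8, 0.9, 0.3   # memory size, recall/adjust rates
GREENS = list(range(10, 51, 5))              # candidate green times (seconds)

def cost(plan):
    """Toy objective: deviation from a demand-derived ideal green per signal."""
    ideal = (40, 25, 15)
    return sum(abs(g - i) for g, i in zip(plan, ideal))

# Initialize the harmony memory with random timing plans.
memory = [[random.choice(GREENS) for _ in range(N_SIGNALS)] for _ in range(HMS)]

for _ in range(300):
    new = []
    for j in range(N_SIGNALS):
        if random.random() < HMCR:            # recall from harmony memory
            g = random.choice(memory)[j]
            if random.random() < PAR:         # pitch adjustment: +/- 5 s
                g = min(max(g + random.choice((-5, 5)), 10), 50)
        else:                                 # random consideration
            g = random.choice(GREENS)
        new.append(g)
    # Replace the worst memory entry if the new harmony improves on it.
    worst = max(range(HMS), key=lambda i: cost(memory[i]))
    if cost(new) < cost(memory[worst]):
        memory[worst] = new

best = min(memory, key=cost)
```

The local search operators proposed in [88] would slot in after the improvement step, refining `new` within its discrete neighborhood before the memory update.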

4. Discussion

A discussion of this review is as follows:
  • In developing countries, the demand for transport infrastructure is increasing exponentially, yet government agencies cannot expand it in a short time-frame, emphasizing the urgent need for efficient traffic control systems. ATSC is one such solution.
  • In contrast to traditional models, the ATSC system considers traffic patterns and vehicle movements and responds to these factors in real-time [101,102]. It supports day-to-day operations by adapting to dynamic traffic flow and providing a user-friendly experience to commuters. TSC system optimization is a challenging and intricate problem: stochastic processes are often involved due to the unpredictable nature of traffic demand and behavior [103], and the solution space for practical problems is so enormous that finding optimal solutions is difficult.
  • By reducing the time lost in heavy traffic, fuel waste, and pollution, whose effects are alarming on a global scale, ATSC will help boost economic output. As a result, there is a growing need for improved traffic management technologies. To facilitate smoother traffic flow, ATSC dynamically modifies the timing of traffic signals [104] based on current traffic conditions.
  • The findings from this review demonstrate that modern ATSC systems designed using various techniques offer substantial improvements in managing the dynamic density of the traffic flow.
  • When applied to single intersections (SIs) and multiple intersections (MIs), ATSC systems have proven to reduce travel time, vehicle idling, and emissions, contributing to a smoother traffic flow. A key observation from the studies is that the complexity of traffic signal optimization increases significantly in MI scenarios due to the interdependency between intersections. In contrast, SI systems are easier to manage but may create bottlenecks in adjacent areas due to isolated decision-making. Techniques like Reinforcement Learning (RL), Deep Reinforcement Learning (DRL), Fuzzy Logic (FL), Dynamic Programming (DP), Metaheuristic (MH), and hybrid methods have shown promising results in dynamically adjusting signal timings based on real-time traffic data. Despite the advancements, gaps remain, particularly in applying ATSC systems to complex, real-world traffic environments with mixed vehicle types and pedestrian interactions.
  • A few SI-ATSC systems utilizing RL techniques, such as SARSA, Q-Learning, and TD-error methods [34,38,44], have shown the advantage of reducing vehicle queue lengths by up to 19%. However, these approaches are limited by their restricted action spaces. Future research should expand the action space by utilizing NNs.
  • Several SI-ATSC systems employ MH techniques, such as PSO and heuristic GA [46,48,49], which are beneficial for addressing non-linear programming tasks involving time-varying delay terms. However, a critical challenge for these methods under interrupted traffic flow is accurately defining the adaptive user equilibrium. The scalability of these methods is a crucial aspect that should be addressed in future work.
  • Several SI-ATSC systems utilize FL techniques, such as FLP and fuzzy inference [52,55], which offer the advantage of scheduling anticipated signal times by adjusting the signal duration based on the number of vehicles at the intersection. However, these strategies do not account for factors such as driver categories, vehicle types, and road infrastructure. Further research using these techniques should include driver categories and vehicle types.
  • The MI-ATSC systems employing RL techniques, such as FA, MARL, Q-Learning, and two-level hierarchical DRL (D3QN) [59,60,62,70,71], aim to optimize traffic signals by balancing safety, efficiency, and decarbonization goals. These techniques excel under high-traffic-demand conditions, successfully addressing all three objectives. However, a key limitation is that each agent must communicate throughout the modeling process, which can reduce the overall communication efficiency.
  • Some MI-ATSC systems utilize MH techniques, such as PSO, GA, ACO, SA, and CS [46,82,83]. These techniques aim to manage road intersections (RIs) in real time through decentralized algorithms, reducing congestion by adjusting signal durations based on input data collected from the execution environment. Future research should incorporate the traffic characteristics of oversaturated conditions into the optimization process.
  • The review also highlights the continued dominance of fixed-time traffic control systems in simulation-based studies, with fewer real-time dynamic systems being explored. This indicates a need for more research and experimentation with real-time adaptive systems, especially in developing countries with prevalent infrastructure challenges. Integrating data from external sources such as weather, pedestrian activity, and unexpected road events has been identified as a critical factor in enhancing ATSC performance. As microsimulation tools like SUMO and VISSIM become more sophisticated, they offer greater opportunities to fine-tune and test these systems under various traffic conditions.

5. Conclusions and Future Work

This paper comprehensively examines the various AI-driven techniques employed in developing ATSC systems and their application to single- (SI) and multiple-intersection (MI) environments. Advanced techniques like RL, DRL, and hybrid approaches have demonstrated significant improvements in traffic management, particularly in reducing travel times and emissions. However, most research has focused on single intersections, leaving a considerable research gap in optimizing multiple intersections, where coordination between signals becomes critical.
Future research should address these challenges by exploring multi-agent systems that can handle the complexity of MI environments while accounting for real-time, fluctuating traffic conditions. Additionally, incorporating diverse real-world factors like pedestrian movement, weather conditions, and emergency scenarios will be vital for improving the effectiveness and applicability of ATSC systems in urban settings. With continuous advancements in AI and real-time data collection, there is great potential to revolutionize traffic management systems and reduce traffic congestion’s economic and environmental impact globally.

Author Contributions

Conceptualization, A.A. and M.M.D.; methodology, A.A.; supervision, M.M.D.; review, M.M.D.; co-supervision, P.S.D.; writing—review and editing, A.T. and M.A.B.; visualizations, review, and editing, A.D.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Data sharing not applicable to this article as no datasets were generated or analysed during the current study.

Acknowledgments

The authors would like to thank the Director, Department of Computer Science and Engineering, Visvesvaraya National Institute of Technology (VNIT), Nagpur, India for providing the necessary facilities for this work.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
AC	Actor–Critic
Ac	Actuated Control
ACO	Ant Colony Optimization
ATSC	Adaptive Traffic Signal Control
ABC	Artificial Bee Colony
ATT	Average Travel Time
BC	Bee Colony Algorithm
Bi	Binary
BP	Back Pressure
CS	Cuckoo Search
CWT	Cumulative Waiting Time
Cycle L	Cycle Length
DP	Dynamic Programming
DRL	Deep Reinforcement Learning
FITS	Fuzzy Intelligence Transportation System
FL	Fuzzy Logic
FLATSC	Fuzzy Logic Adaptive Traffic Signal Control
GA	Genetic Algorithm
Gn	Green Time
HS	Harmony Search
JADE	Java Agent Development Environment
KS-DDPG	Knowledge-Sharing Deep Deterministic Policy Gradient
MARL	Multi-Agent Reinforcement Learning
MBRL	Model-Based RL
MH	Metaheuristic Algorithm
MI	Multiple Intersection
MOL	Multi-objective Learning
MOLAC	Multi-objective Learning Agent Cooperation
Nash-A2C	Nash Advantage Actor–Critic
Nash-A3C	Nash Asynchronous Advantage Actor–Critic
NN	Neural Network
NSGA	Non-dominated Sorting Genetic Algorithm
PL	Programming Language
PSO	Particle Swarm Optimization
QL	Queue Learning
Ql	Queue Length
QLAC	Queue Learning Actor–Critic
RBF	Radial Basis Function
RL	Reinforcement Learning
SA	Sarsa Algorithm
SAFA	Sarsa Algorithm with Function Approximation
SI	Single Intersection
SUMO	Simulation of Urban Mobility
TSC	Traffic Signal Control
TSS	Traffic Signal Scheduling
TST	Traffic Signal Timing
TSO	Traffic Signal Optimization
VISSIM	Verkehr In Städten SIMulationsmodell
WCA	Water Cycle Algorithm

References

  1. Neelakandan, S.; Berlin, M.A.; Tripathi, S.; Devi, V.B.; Bhardwaj, I.; Arulkumar, N. IoT-Based Traffic Prediction and Traffic Signal Control System for Smart City. Soft Comput. 2021, 25, 12241–12248. [Google Scholar] [CrossRef]
  2. Nielsen, O.A.; Frederiksen, R.D.; Simonsen, N. Using Expert System Rules to Establish Data for Intersections and Turns in Road Networks. Int. Trans. Oper. Res. 1998, 5, 569–581. [Google Scholar] [CrossRef]
  3. Jing, P.; Huang, H.; Chen, L. An Adaptive Traffic Signal Control in a Connected Vehicle Environment: A Systematic Review. Information 2017, 8, 101. [Google Scholar] [CrossRef]
  4. Kim, M.; Schrader, M.; Yoon, H.-S.; Bittle, J.A. Optimal Traffic Signal Control Using Priority Metric Based on Real-Time Measured Traffic Information. Sustainability 2023, 15, 7637. [Google Scholar] [CrossRef]
  5. Zaghal, R.; Thabatah, K.; Salah, S. Towards a Smart Intersection Using Traffic Load Balancing Algorithm. In Proceedings of the 2017 Computing Conference, London, UK, 18–20 July 2017; pp. 485–491. [Google Scholar]
  6. Mishra, S.; Singh, V.; Gupta, A.; Bhattacharya, D.; Mudgal, A. Adaptive Traffic Signal Control for Developing Countries Using Fused Parameters Derived from Crowd-Source Data. Transp. Lett. 2023, 15, 296–307. [Google Scholar] [CrossRef]
  7. Jovanović, A.; Teodorović, D. Pre-Timed Control for an under-Saturated and over-Saturated Isolated Intersection: A Bee Colony Optimization Approach. Transp. Plan. Technol. 2017, 40, 556–576. [Google Scholar] [CrossRef]
  8. Ahmed, E.K.E.; Khalifa, A.M.A.; Kheiri, A. Evolutionary Computation for Static Traffic Light Cycle Optimisation. In Proceedings of the 2018 International Conference on Computer, Control, Electrical, and Electronics Engineering (ICCCEEE), Khartoum, Sudan, 12–14 August 2018; pp. 1–6. [Google Scholar]
  9. Noaeen, M.; Naik, A.; Goodman, L.; Crebo, J.; Abrar, T.; Abad, Z.S.H.; Bazzan, A.L.C.; Far, B. Reinforcement Learning in Urban Network Traffic Signal Control: A Systematic Literature Review. Expert Syst. Appl. 2022, 99, 116830. [Google Scholar] [CrossRef]
  10. Tian, Y.; Liu, S.; Yan, X.; Zhu, T.; Zhang, Y. Active Control Method of Traffic Signal Based on Parallel Control Theory. IEEE J. Radio Freq. Identif. 2024, 8, 334–340. [Google Scholar] [CrossRef]
  11. Li, D.; Zhu, F.; Wu, J.; Wong, Y.D.; Chen, T. Managing Mixed Traffic at Signalized Intersections: An Adaptive Signal Control and CAV Coordination System Based on Deep Reinforcement Learning. Expert Syst. Appl. 2024, 238, 121959. [Google Scholar] [CrossRef]
  12. Sawarkar, A.D.; Shrimankar, D.D.; Ali, S.; Agrahari, A.; Singh, L. Bamboo Plant Classification Using Deep Transfer Learning with a Majority Multiclass Voting Algorithm. Appl. Sci. 2024, 14, 1023. [Google Scholar] [CrossRef]
  13. Haydari, A.; Yılmaz, Y. Deep Reinforcement Learning for Intelligent Transportation Systems: A Survey. IEEE Trans. Intell. Transp. Syst. 2022, 23, 11–32. [Google Scholar] [CrossRef]
  14. Agarwal, A.; Sahu, D.; Nautiyal, A.; Gupta, M.; Agarwal, P. Fusing Crowdsourced Data to an Adaptive Wireless Traffic Signal Control System Architecture. Internet Things 2024, 26, 101169. [Google Scholar] [CrossRef]
  15. Chen, L.; Englund, C. Cooperative Intersection Management: A Survey. IEEE Trans. Intell. Transp. Syst. 2016, 17, 570–586. [Google Scholar] [CrossRef]
  16. Anirudh, R.; Krishnan, M.; Kekuda, A. Intelligent Traffic Control System Using Deep Reinforcement Learning. In Proceedings of the 2022 International Conference on Innovative Trends in Information Technology (ICITIIT), Kottayam, India, 12–13 February 2022; pp. 1–8. [Google Scholar]
  17. Saleem, M.; Abbas, S.; Ghazal, T.M.; Adnan Khan, M.; Sahawneh, N.; Ahmad, M. Smart Cities: Fusion-Based Intelligent Traffic Congestion Control System for Vehicular Networks Using Machine Learning Techniques. Egypt. Inform. J. 2022, 23, 417–426. [Google Scholar] [CrossRef]
  18. Liu, B.; Ding, Z. A Distributed Deep Reinforcement Learning Method for Traffic Light Control. Neurocomputing 2022, 490, 390–399. [Google Scholar] [CrossRef]
  19. Zhao, P.; Gao, Y.; Sun, X. How Does Artificial Intelligence Affect Green Economic Growth?—Evidence from China. Sci. Total Environ. 2022, 834, 155306. [Google Scholar] [CrossRef]
  20. Tajalli, M.; Hajbabaie, A. Traffic Signal Timing and Trajectory Optimization in a Mixed Autonomy Traffic Stream. IEEE Trans. Intell. Transp. Syst. 2022, 23, 6525–6538. [Google Scholar] [CrossRef]
  21. Majstorović, Ž.; Tišljarić, L.; Ivanjko, E.; Carić, T. Urban Traffic Signal Control under Mixed Traffic Flows: Literature Review. Appl. Sci. 2023, 13, 4484. [Google Scholar] [CrossRef]
  22. Kang, L.; Lu, W.; Liu, L. Research on Route Hierarchical Control Strategy from the Perspective of Macroscopic Traffic Network. J. Intell. Transp. Syst. 2022, 27, 818–833. [Google Scholar] [CrossRef]
  23. Ratrout, N.T.; Rahman, S.M. A Comparative Analysis of Currently Used Microscopic and Macroscopic Traffic. Science 2009, 34, 121–133. [Google Scholar]
  24. Chevallier, E.; Leclercq, L. A Macroscopic Theory for Unsignalized Intersections. Transp. Res. Part B Methodol. 2007, 41, 1139–1150. [Google Scholar] [CrossRef]
  25. Gökçe, M.A.; Öner, E.; Işık, G. Traffic Signal Optimization with Particle Swarm Optimization for Signalized Roundabouts. Simulation 2015, 91, 456–466. [Google Scholar] [CrossRef]
  26. Ahmed, M.A.A.; Khoo, H.L.; Ng, O.-E. Discharge Control Policy Based on Density and Speed for Deep Q-Learning Adaptive Traffic Signal. Transp. B Transp. Dyn. 2023, 11, 1707–1726. [Google Scholar] [CrossRef]
  27. Tsitsokas, D.; Kouvelas, A.; Geroliminis, N. Two-Layer Adaptive Signal Control Framework for Large-Scale Dynamically-Congested Networks: Combining Efficient Max Pressure with Perimeter Control. Transp. Res. Part C Emerg. Technol. 2023, 152, 104128. [Google Scholar] [CrossRef]
  28. Zhao, D.; Dai, Y.; Zhang, Z. Computational Intelligence in Urban Traffic Signal Control: A Survey. IEEE Trans. Syst. Man, Cybern. Part C Appl. Rev. 2012, 42, 485–494. [Google Scholar] [CrossRef]
  29. Kolat, M.; Kővári, B.; Bécsi, T.; Aradi, S. Multi-Agent Reinforcement Learning for Traffic Signal Control: A Cooperative Approach. Sustainability 2023, 15, 3479. [Google Scholar] [CrossRef]
  30. Kang, D.; Li, Z.; Levin, M.W. Evasion Planning for Autonomous Intersection Control Based on an Optimized Conflict Point Control Formulation. J. Transp. Saf. Secur. 2022, 14, 2074–2110. [Google Scholar] [CrossRef]
  31. Levin, M.W.; Rey, D. Conflict-Point Formulation of Intersection Control for Autonomous Vehicles. Transp. Res. Part C Emerg. Technol. 2017, 85, 528–547. [Google Scholar] [CrossRef]
  32. Kaige, W.; Shiru, Q.; Yumei, Z. A Stochastic Adaptive Control Model for Isolated Intersections. In Proceedings of the 2007 IEEE International Conference on Robotics and Biomimetics (ROBIO), Sanya, China, 15–18 December 2007; pp. 2256–2260. [Google Scholar] [CrossRef]
  33. Shoufeng, L.; Ximin, L.; Shiqiang, D. Q-Learning for Adaptive Traffic Signal Control Based on Delay Minimization Strategy. In Proceedings of the 2008 IEEE International Conference on Networking, Sensing and Control, ICNSC, Sanya, China, 6–8 April 2008; pp. 687–691. [Google Scholar] [CrossRef]
  34. El-Tantawy, S.; Abdulhai, B.; Abdelgawad, H. Design of Reinforcement Learning Parameters for Seamless Application of Adaptive Traffic Signal Control. J. Intell. Transp. Syst. 2014, 18, 227–245. [Google Scholar] [CrossRef]
  35. Sharifzadeh, S.; Chiotellis, I.; Triebel, R.; Cremers, D. Learning to Drive Using Inverse Reinforcement Learning and Deep Q-Networks. arXiv 2016, arXiv:1612.03653. [Google Scholar]
  36. Gao, J.; Shen, Y.; Liu, J.; Ito, M.; Shiratori, N. Adaptive Traffic Signal Control: Deep Reinforcement Learning Algorithm with Experience Replay and Target Network. arXiv 2017, arXiv:1705.02755. [Google Scholar]
  37. Touhbi, S.; Babram, M.A.; Nguyen-Huu, T.; Marilleau, N.; Hbid, M.L.; Cambier, C.; Stinckwich, S. Adaptive Traffic Signal Control: Exploring Reward Definition for Reinforcement Learning. Procedia Comput. Sci. 2017, 109, 513–520. [Google Scholar] [CrossRef]
  38. Genders, W.; Razavi, S. Evaluating Reinforcement Learning State Representations for Adaptive Traffic Signal Control. Procedia Comput. Sci. 2018, 130, 26–33. [Google Scholar] [CrossRef]
  39. Wang, H.; Chen, H.; Wu, Q.; Ma, C.; Li, Y. Multi-Intersection Traffic Optimisation: A Benchmark Dataset and a Strong Baseline. IEEE Open J. Intell. Transp. Syst. 2022, 3, 126–136. [Google Scholar] [CrossRef]
  40. Garg, D.; Chli, M.; Vogiatzis, G. Deep Reinforcement Learning for Autonomous Traffic Light Control. In Proceedings of the 2018 3rd IEEE International Conference on Intelligent Transportation Engineering (ICITE), Singapore, 3–5 September 2018; pp. 214–218. [Google Scholar]
  41. Chin, Y.K.; Lee, L.K.; Bolong, N.; Yang, S.S.; Teo, K.T.K. Exploring Q-Learning Optimization in Traffic Signal Timing Plan Management. In Proceedings of the 2011 Third International Conference on Computational Intelligence, Communication Systems and Networks, Bali, Indonesia, 26–28 July 2011; pp. 269–274. [Google Scholar]
  42. Liang, X.; Du, X.; Wang, G.; Han, Z. A Deep Reinforcement Learning Network for Traffic Light Cycle Control. IEEE Trans. Veh. Technol. 2019, 68, 1243–1253. [Google Scholar] [CrossRef]
  43. Tang, D.; Duan, Y. Traffic Signal Control Optimization Based on Neural Network in the Framework of Model Predictive Control. Actuators 2024, 13, 251. [Google Scholar] [CrossRef]
  44. Thorpe, T.L.; Anderson, C.W. Traffic Light Control Using SARSA with Three State Representations; IBM Corp.: Armonk, NY, USA, 1996. [Google Scholar]
  45. El-Tantawy, S.; Abdulhai, B. An Agent-Based Learning towards Decentralized and Coordinated Traffic Signal Control. In Proceedings of the 13th International IEEE Conference on Intelligent Transportation Systems, Funchal, Portugal, 19–22 September 2010; pp. 665–670. [Google Scholar] [CrossRef]
  46. Panovski, D.; Zaharia, T. Simulation-Based Vehicular Traffic Lights Optimization. In Proceedings of the 2016 12th International Conference on Signal-Image Technology & Internet-Based Systems (SITIS), Naples, Italy, 28 November–1 December 2016; pp. 258–265. [Google Scholar]
  47. Li, Z.; Shahidehpour, M.; Bahramirad, S.; Khodaei, A. Optimizing Traffic Signal Settings in Smart Cities. IEEE Trans. Smart Grid 2017, 8, 2382–2393. [Google Scholar] [CrossRef]
  48. Yu, H.; Ma, R.; Zhang, H.M. Optimal Traffic Signal Control under Dynamic User Equilibrium and Link Constraints in a General Network. Transp. Res. Part B Methodol. 2018, 110, 302–325. [Google Scholar] [CrossRef]
  49. Jia, H.; Lin, Y.; Luo, Q.; Li, Y.; Miao, H. Multi-Objective Optimization of Urban Road Intersection Signal Timing Based on Particle Swarm Optimization Algorithm. Adv. Mech. Eng. 2019, 11, 1687814019842498. [Google Scholar] [CrossRef]
  50. Elgarej, M.; Khalifa, M.; Youssfi, M. Traffic Lights Optimization with Distributed Ant Colony Optimization Based on Multi-Agent System BT—Networked Systems; Abdulla, P.A., Delporte-Gallet, C., Eds.; Springer International Publishing: Cham, Switzerland, 2016; pp. 266–279. [Google Scholar]
  51. Chuo, H.S.E.; Tan, M.K.; Chong, A.C.H.; Chin, R.K.Y.; Teo, K.T.K. Evolvable Traffic Signal Control for Intersection Congestion Alleviation with Enhanced Particle Swarm Optimisation. In Proceedings of the 2017 IEEE 2nd International Conference on Automatic Control and Intelligent Systems (I2CACIS), Kota Kinabalu, Malaysia, 21–21 October 2017; pp. 92–97. [Google Scholar]
  52. Yu, D.; Tian, X.; Xing, X.; Gao, S. Signal Timing Optimization Based on Fuzzy Compromise Programming for Isolated Signalized Intersection. Math. Probl. Eng. 2016, 2016, 1682394. [Google Scholar] [CrossRef]
  53. Jin, J.; Ma, X.; Kosonen, I. An Intelligent Control System for Traffic Lights with Simulation-Based Evaluation. Control Eng. Pract. 2017, 58, 24–33. [Google Scholar] [CrossRef]
  54. Tunc, I.; Soylemez, M.T. Fuzzy Logic and Deep Q Learning Based Control for Traffic Lights. Alexandria Eng. J. 2023, 67, 343–359. [Google Scholar] [CrossRef]
  55. Aksaç, A.; Uzun, E.; Özyer, T. A Real Time Traffic Simulator Utilizing an Adaptive Fuzzy Inference Mechanism by Tuning Fuzzy Parameters. Appl. Intell. 2012, 36, 698–720. [Google Scholar] [CrossRef]
  56. Vogel, A.; Oremović, I.; Šimić, R.; Ivanjko, E. Improving Traffic Light Control by Means of Fuzzy Logic. In Proceedings of the 2018 International Symposium ELMAR, Zadar, Croatia, 16–19 September 2018; pp. 51–56. [Google Scholar] [CrossRef]
  57. Ilgin Guler, S.; Menendez, M.; Meier, L. Using Connected Vehicle Technology to Improve the Efficiency of Intersections. Transp. Res. Part C Emerg. Technol. 2014, 46, 121–131. [Google Scholar] [CrossRef]
  58. Su, G.; Yang, J.J. Enhancing the Robustness of Traffic Signal Control with StageLight: A Multiscale Learning Approach. Eng 2024, 5, 104–115. [Google Scholar] [CrossRef]
  59. Prashanth, L.A.; Bhatnagar, S.; Member, S. Approximation for Traffic Signal Control. IEEE Trans. Intell. Transp. Syst. 2011, 12, 412–421. [Google Scholar]
  60. El-Tantawy, S.; Abdulhai, B.; Abdelgawad, H. Multiagent Reinforcement Learning for Integrated Network of Adaptive Traffic Signal Controllers (MARLIN-ATSC): Methodology and Large-Scale Application on Downtown Toronto. IEEE Trans. Intell. Transp. Syst. 2013, 14, 1140–1150. [Google Scholar] [CrossRef]
  61. Zeinaly, Z.; Sojoodi, M.; Bolouki, S. A Resilient Intelligent Traffic Signal Control Scheme for Accident Scenario at Intersections via Deep Reinforcement Learning. Sustainability 2023, 15, 1329. [Google Scholar] [CrossRef]
  62. Abdoos, M.; Mozayani, N.; Bazzan, A.L.C. Hierarchical Control of Traffic Signals Using Q-Learning with Tile Coding. Appl. Intell. 2014, 40, 201–213. [Google Scholar] [CrossRef]
  63. Aslani, M.; Mesgari, M.S.; Wiering, M. Adaptive Traffic Signal Control with Actor-Critic Methods in a Real-World Traffic Network with Different Traffic Disruption Events. Transp. Res. Part C Emerg. Technol. 2017, 85, 732–752. [Google Scholar] [CrossRef]
  64. Aziz, H.M.A.; Zhu, F.; Ukkusuri, S.V. Learning-Based Traffic Signal Control Algorithms with Neighborhood Information Sharing: An Application for Sustainable Mobility. J. Intell. Transp. Syst. Technol. Plan. Oper. 2018, 22, 40–52. [Google Scholar] [CrossRef]
  65. Haddad, T.A.; Hedjazi, D.; Aouag, S. A Deep Reinforcement Learning-Based Cooperative Approach for Multi-Intersection Traffic Signal Control. Eng. Appl. Artif. Intell. 2022, 114, 105019. [Google Scholar] [CrossRef]
  66. Li, Z.; Yu, H.; Zhang, G.; Dong, S.; Xu, C.-Z. Network-Wide Traffic Signal Control Optimization Using a Multi-Agent Deep Reinforcement Learning. Transp. Res. Part C Emerg. Technol. 2021, 125, 103059. [Google Scholar] [CrossRef]
  67. Lin, Y.; Dai, X.; Li, L.; Wang, F.-Y. An Efficient Deep Reinforcement Learning Model for Urban Traffic Control. arXiv 2018, arXiv:1808.01876. [Google Scholar]
  68. Bouktif, S.; Cheniki, A.; Ouni, A.; El-Sayed, H. Deep Reinforcement Learning for Traffic Signal Control with Consistent State and Reward Design Approach. Knowl.-Based Syst. 2023, 267, 110440. [Google Scholar] [CrossRef]
  69. Kumar, R.; Sharma, N.V.K.; Chaurasiya, V.K. Adaptive Traffic Light Control Using Deep Reinforcement Learning Technique. Multimed. Tools Appl. 2024, 83, 13851–13872. [Google Scholar] [CrossRef]
  70. Abdoos, M.; Bazzan, A.L.C. Hierarchical Traffic Signal Optimization Using Reinforcement Learning and Traffic Prediction with Long-Short Term Memory. Expert Syst. Appl. 2021, 171, 114580. [Google Scholar] [CrossRef]
  71. Zhang, G.; Chang, F.; Jin, J.; Yang, F.; Huang, H. Multi-Objective Deep Reinforcement Learning Approach for Adaptive Traffic Signal Control System with Concurrent Optimization of Safety, Efficiency, and Decarbonization at Intersections. Accid. Anal. Prev. 2024, 199, 107451. [Google Scholar] [CrossRef]
  72. Wang, T.; Zhu, Z.; Zhang, J.; Tian, J.; Zhang, W. A Large-Scale Traffic Signal Control Algorithm Based on Multi-Layer Graph Deep Reinforcement Learning. Transp. Res. Part C Emerg. Technol. 2024, 162, 104582. [Google Scholar] [CrossRef]
  73. Zhou, B.; Zhou, Q.; Hu, S.; Ma, D.; Jin, S.; Lee, D.-H. Cooperative Traffic Signal Control Using a Distributed Agent-Based Deep Reinforcement Learning With Incentive Communication. IEEE Trans. Intell. Transp. Syst. 2024, 25, 10147–10160. [Google Scholar] [CrossRef]
  74. Wu, Q.; Wu, J.; Shen, J.; Du, B.; Telikani, A.; Fahmideh, M.; Liang, C. Distributed Agent-Based Deep Reinforcement Learning for Large Scale Traffic Signal Control. Knowl.-Based Syst. 2022, 241, 108304. [Google Scholar] [CrossRef]
  75. Steingröver, M.; Schouten, R.; Peelen, S.; Nijhuis, E.; Bakker, B. Reinforcement Learning of Traffic Light Controllers Adapting to Traffic Congestion. In Proceedings of the Belgian/Netherlands Artificial Intelligence Conference (BNAIC), 2005; pp. 216–223. [Google Scholar]
  76. Prabuchandran, K.J.; Hemanth Kumar, A.N.; Bhatnagar, S. Multi-Agent Reinforcement Learning for Traffic Signal Control. In Proceedings of the 17th International IEEE Conference on Intelligent Transportation Systems (ITSC), Qingdao, China, 8–11 October 2014; pp. 2529–2534. [Google Scholar]
  77. Khamis, M.A.; Gomaa, W. Enhanced Multiagent Multi-Objective Reinforcement Learning for Urban Traffic Light Control. In Proceedings of the 2012 11th International Conference on Machine Learning and Applications, Boca Raton, FL, USA, 12–15 December 2012; Volume 1, pp. 586–591. [Google Scholar]
  78. Khamis, M.A.; Gomaa, W. Adaptive Multi-Objective Reinforcement Learning with Hybrid Exploration for Traffic Signal Control Based on Cooperative Multi-Agent Framework. Eng. Appl. Artif. Intell. 2014, 29, 134–151. [Google Scholar] [CrossRef]
  79. Prashanth, L.A.; Bhatnagar, S. Reinforcement Learning with Average Cost for Adaptive Control of Traffic Lights at Intersections. In Proceedings of the 2011 14th International IEEE Conference on Intelligent Transportation Systems (ITSC), Washington, DC, USA, 5–7 October 2011; pp. 1640–1645. [Google Scholar]
  80. Houli, D.; Zhiheng, L.; Yi, Z. Multiobjective Reinforcement Learning for Traffic Signal Control Using Vehicular Ad Hoc Network. EURASIP J. Adv. Signal Process. 2010, 2010, 1–7. [Google Scholar] [CrossRef]
  81. Jin, J.; Ma, X. A Multi-Objective Agent-Based Control Approach With Application in Intelligent Traffic Signal System. IEEE Trans. Intell. Transp. Syst. 2019, 20, 3900–3912. [Google Scholar] [CrossRef]
  82. Hajbabaie, A.; Benekohal, R.F. A Program for Simultaneous Network Signal Timing Optimization and Traffic Assignment. IEEE Trans. Intell. Transp. Syst. 2015, 16, 2573–2586. [Google Scholar] [CrossRef]
  83. Dakic, I.; Stevanovic, J.; Stevanovic, A. Backpressure Traffic Control Algorithms in Field-like Signal Operations. In Proceedings of the 2015 IEEE 18th International Conference on Intelligent Transportation Systems, Gran Canaria, Spain, 15–18 September 2015; pp. 137–142. [Google Scholar]
  84. Nguyen, P.T.M.; Passow, B.N.; Yang, Y. Improving Anytime Behavior for Traffic Signal Control Optimization Based on NSGA-II and Local Search. In Proceedings of the 2016 International Joint Conference on Neural Networks (IJCNN), Vancouver, BC, Canada, 24–29 July 2016; pp. 4611–4618. [Google Scholar]
  85. Wardrop, J.G. Road Paper. Some Theoretical Aspects of Road Traffic Research. Proc. Inst. Civ. Eng. 1952, 1, 325–362. [Google Scholar] [CrossRef]
  86. Guo, J.; Kong, Y.; Li, Z.; Huang, W.; Cao, J.; Wei, Y. A Model and Genetic Algorithm for Area-Wide Intersection Signal Optimization under User Equilibrium Traffic. Math. Comput. Simul. 2019, 155, 92–104. [Google Scholar] [CrossRef]
  87. El Hatri, C.; Boumhidi, J. Q-Learning Based Intelligent Multi-Objective Particle Swarm Optimization of Light Control for Traffic Urban Congestion Management. In Proceedings of the 2016 4th IEEE International Colloquium on Information Science and Technology (CiSt), Tangier, Morocco, 24–26 October 2016; pp. 794–799. [Google Scholar]
  88. Gao, K.; Zhang, Y.; Sadollah, A.; Su, R. Optimizing Urban Traffic Light Scheduling Problem Using Harmony Search with Ensemble of Local Search. Appl. Soft Comput. 2016, 48, 359–372. [Google Scholar] [CrossRef]
  89. Srivastava, S.; Sahana, S.K. Nested Hybrid Evolutionary Model for Traffic Signal Optimization. Appl. Intell. 2017, 46, 113–123. [Google Scholar] [CrossRef]
  90. Massow, K.; Pfeifer, N.; Ketzler, F.; Radusch, I. Close-Range Coordination to Enhance Constant Distance Spacing Policies in Oversaturated Traffic Systems. Sensors 2024, 24, 4865. [Google Scholar] [CrossRef]
  91. Chen, S.; Sun, D.J. An Improved Adaptive Signal Control Method for Isolated Signalized Intersection Based on Dynamic Programming. IEEE Intell. Transp. Syst. Mag. 2016, 8, 4–14. [Google Scholar] [CrossRef]
  92. Lu, K.; Jiang, S.; Xin, W.; Zhang, J.; He, K. Algebraic Method of Regional Green Wave Coordinated Control. J. Intell. Transp. Syst. 2022, 27, 799–817. [Google Scholar] [CrossRef]
  93. Zhang, Z.; Zhang, W.; Liu, Y.; Xiong, G. Mean Field Multi-Agent Reinforcement Learning Method for Area Traffic Signal Control. Electronics 2023, 12, 4686. [Google Scholar] [CrossRef]
  94. Gao, K.; Zhang, Y.; Su, R.; Yang, F.; Suganthan, P.N.; Zhou, M. Solving Traffic Signal Scheduling Problems in Heterogeneous Traffic Network by Using Meta-Heuristics. IEEE Trans. Intell. Transp. Syst. 2019, 20, 3272–3282. [Google Scholar] [CrossRef]
  95. Jiang, C.-Y.; Hu, X.-M.; Chen, W.-N. An Urban Traffic Signal Control System Based on Traffic Flow Prediction. In Proceedings of the 2021 13th International Conference on Advanced Computational Intelligence (ICACI), Wanzhou, China, 14–16 May 2021; pp. 259–265. [Google Scholar]
  96. Storani, F.; Di Pace, R.; De Schutter, B. A Traffic Responsive Control Framework for Signalized Junctions Based on Hybrid Traffic Flow Representation. J. Intell. Transp. Syst. 2022, 27, 606–625. [Google Scholar] [CrossRef]
  97. Manandhar, B.; Joshi, B. Adaptive Traffic Light Control with Statistical Multiplexing Technique and Particle Swarm Optimization in Smart Cities. In Proceedings of the 2018 IEEE 3rd International Conference on Computing, Communication and Security (ICCCS), Kathmandu, Nepal, 25–27 October 2018; pp. 210–217. [Google Scholar]
  98. Bernas, M.; Płaczek, B.; Smyła, J. A Neuroevolutionary Approach to Controlling Traffic Signals Based on Data from Sensor Network. Sensors 2019, 19, 1776. [Google Scholar] [CrossRef] [PubMed]
  99. Bie, Y.; Cheng, S.; Liu, Z. Optimization of Signal-Timing Parameters for the Intersection with Hook Turns. Transport 2017, 32, 233–241. [Google Scholar] [CrossRef]
  100. Tarek, Z.; AL-Rahmawy, M.; Tolba, A. Fog Computing for Optimized Traffic Control Strategy. J. Intell. Fuzzy Syst. 2019, 36, 1401–1415. [Google Scholar] [CrossRef]
  101. Xu, H.; Zhang, N.; Li, Z.; Zhuo, Z.; Zhang, Y.; Zhang, Y.; Ding, H. Energy-Saving Speed Planning for Electric Vehicles Based on RHRL in Car Following Scenarios. Sustainability 2023, 15, 15947. [Google Scholar] [CrossRef]
  102. Zhao, Z.; Wang, K.; Wang, Y.; Liang, X. Enhancing Traffic Signal Control with Composite Deep Intelligence. Expert Syst. Appl. 2024, 244, 123020. [Google Scholar] [CrossRef]
  103. Mok, K.; Zhang, L. Adaptive Traffic Signal Management Method Combining Deep Learning and Simulation. Multimed. Tools Appl. 2022, 83, 15439–15459. [Google Scholar] [CrossRef]
  104. Li, T.; Guo, F.; Krishnan, R.; Sivakumar, A. An Analysis of the Value of Optimal Routing and Signal Timing Control Strategy with Connected Autonomous Vehicles. J. Intell. Transp. Syst. 2022, 28, 252–266. [Google Scholar] [CrossRef]
Figure 1. Publication status of Traffic Signal Control with its application as an optimization and Adaptive Traffic Signal Control system in the last decade.
Figure 2. Single-intersection (SI) and multiple-intersection (MI) configurations.
Figure 3. Types of TSC and overview of various techniques used in designing TSC.
Figure 4. Publications per year according to the type of intersections used for ATSC.
Table 1. Classification of prevalent traffic microsimulation tools.

| Features | SUMO | VISSIM | AIMSUN | MATSim | CORSIM | Paramics |
|---|---|---|---|---|---|---|
| Open source | Y | N | N | Y | N | N |
| System | CC | CC | CC | CC | DC | DC |
| Visualization | 2D/3D | 2D/3D | 2D/3D | 2D | 2D/3D | 2D/3D |
| Pedestrian | Y | Y | Y | N | Y | Y |
| Scope of application | C | C/R | R/Co | C/R | C/R | C/R |
| Output | XML file | XML file | Graph based | Text based | XML, CSV files | HTML, CSV, XML files |
| Import maps | Y | Y | Y | Y | NA | NA |
| Programming language | CPP, VB, Matlab, Python | CPP, VB, Matlab, Python | Python, CPP | NA | NA | NA |
| Level of coding | Difficult | Easy | Difficult | NA | NA | NA |

Y: Yes; N: No; NA: Not available
Table 2. SI-ATSC using Reinforcement Learning.

| Ref. | RL Approach | State | Action | Reward | Compared with |
|---|---|---|---|---|---|
| [44] | SARSA | # of vehicles fixed; constant and variable vehicle distance | Bi phase | Fixed penalty (−1) | FTS, different states |
| [38] | QL | Queue size | Bi phase | Total latency | FTS |
| [32] | SARSA | # of vehicles | Bi phase | Coefficients of state | FTS, Ac |
| [45] | QL | Length of queue, total delay | Gn phase | Change in total delay | FTS |
| [34] | SARSA, TD error | Queue size, total delay | Bi and Gn phases | Immediate delay, total delay, queue size | FTS, Ac |
| [33] | QL | Total delay time | Time change in Gn phase | Total latency | FTS |
| [37] | QL | Max left queue size | Gn phase time | Queue size, total delay, throughput | Variable vehicle demand |

FTS: Fixed-time scheduling; Ac: Actuated control; #: Number of; Gn: Green; Bi: Binary
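The tabular RL designs summarized in Table 2 share one loop: observe a discretized traffic state, select a phase, and update a Q-value from a delay-based reward. The toy sketch below illustrates that loop for a single intersection; the environment (arrival rates, discharge rate of 2 vehicles per step, queue bins as the state, negative total queue as the reward proxy) is invented for illustration and is not a reimplementation of any cited controller.

```python
import random

class ToyIntersection:
    """Two competing approaches; the action decides which one gets green."""

    def __init__(self, arrival_rates=(0.4, 0.3), seed=0):
        self.arrival_rates = arrival_rates
        self.rng = random.Random(seed)
        self.queues = [0, 0]

    def reset(self):
        self.queues = [0, 0]
        return self._state()

    def _state(self):
        # Discretize each queue into bins 0..3 to keep the Q-table small.
        return tuple(min(q // 3, 3) for q in self.queues)

    def step(self, action):
        # Stochastic arrivals on both approaches.
        for i, rate in enumerate(self.arrival_rates):
            if self.rng.random() < rate:
                self.queues[i] += 1
        # The approach holding green discharges up to 2 vehicles.
        self.queues[action] = max(0, self.queues[action] - 2)
        # Reward: negative total queue length, a common proxy for delay.
        reward = -sum(self.queues)
        return self._state(), reward

def q_learning(env, episodes=200, steps=100, alpha=0.1, gamma=0.9, eps=0.1):
    """Epsilon-greedy tabular Q-learning over the toy intersection."""
    q = {}
    rng = random.Random(1)
    for _ in range(episodes):
        s = env.reset()
        for _ in range(steps):
            if rng.random() < eps:
                a = rng.randrange(2)
            else:
                a = max((0, 1), key=lambda x: q.get((s, x), 0.0))
            s2, r = env.step(a)
            best_next = max(q.get((s2, x), 0.0) for x in (0, 1))
            old = q.get((s, a), 0.0)
            q[(s, a)] = old + alpha * (r + gamma * best_next - old)
            s = s2
    return q
```

The learned greedy policy tends to serve the longer queue, which is exactly the behavior the queue-based rewards in Table 2 are designed to induce.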
Table 3. SI-ATSC using metaheuristic techniques.

| Objectives | MM | CL | GT | Off | PS | SI | MI | FT | RT | Sen | Cam | Sim | Simulator Used | Method | Ref. |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ↓ ATT | N | N | Y | N | N | Y | N | Y | N | N | Y | N | Vissim | PSO | [25] |
| ↓ Gn at RI | N | N | Y | N | Y | Y | N | N | Y | NA | NA | NA | JADE | ACO | [50] |
| ↓ ATT | N | N | Y | N | N | Y | N | Y | N | N | N | Y | Sumo | GA and hyper-heuristic | [8] |
| ↓ Ql at RI | N | Y | Y | Y | N | Y | N | Y | N | N | Y | NA | NA | PSO | [51] |
| ↓ Avg. delay at RI | Y | Y | Y | NA | NA | Y | NA | Y | NA | NA | NA | Y | NA | BC | [7] |

Column key — MM: mathematical model; CL: cycle length; GT: green time; Off: offsets; PS: phase sequence; SI/MI: single/multiple intersections; FT/RT: fixed-time/real-time control strategy; Sen: sensor/detector; Cam: camera; Sim: simulated/various sources. ↓: Decrease; Y: Yes; N: No; NA: Not applicable; Avg: Average; Gn: Green time
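To make concrete how metaheuristics such as those in Table 3 search the signal-timing space, the sketch below uses particle swarm optimization to tune a single green split under a fixed cycle, scoring each candidate with Webster's uniform-delay term. The cycle length, lost time, saturation flow, and demand values are all invented for illustration, and the single-variable formulation is a deliberate simplification of the cited multi-variable problems.

```python
import random

CYCLE = 90.0             # cycle length (s), assumed
LOST = 10.0              # total lost time per cycle (s), assumed
SAT = 1800.0             # saturation flow per phase (veh/h), assumed
DEMAND = (700.0, 450.0)  # arrival flows for the two phases (veh/h), assumed

def avg_delay(g1):
    """Flow-weighted average delay using Webster's uniform-delay term."""
    greens = (g1, CYCLE - LOST - g1)
    total, flow = 0.0, 0.0
    for g, v in zip(greens, DEMAND):
        lam = g / CYCLE                  # effective green ratio
        x = min(v / (lam * SAT), 0.99)   # degree of saturation, capped
        d = CYCLE * (1 - lam) ** 2 / (2 * (1 - lam * x))
        total += d * v
        flow += v
    return total / flow

def pso(n=20, iters=60, lo=15.0, hi=65.0, seed=0):
    """Standard PSO over the scalar green time of phase 1."""
    rng = random.Random(seed)
    xs = [rng.uniform(lo, hi) for _ in range(n)]  # particle positions
    vs = [0.0] * n                                # particle velocities
    pb = xs[:]                                    # personal bests
    gb = min(xs, key=avg_delay)                   # global best
    for _ in range(iters):
        for i in range(n):
            r1, r2 = rng.random(), rng.random()
            vs[i] = (0.7 * vs[i] + 1.5 * r1 * (pb[i] - xs[i])
                     + 1.5 * r2 * (gb - xs[i]))
            xs[i] = min(max(xs[i] + vs[i], lo), hi)
            if avg_delay(xs[i]) < avg_delay(pb[i]):
                pb[i] = xs[i]
                if avg_delay(xs[i]) < avg_delay(gb):
                    gb = xs[i]
    return gb
```

Because phase 1 carries the heavier demand, the swarm converges to a split that gives it the larger share of the 80 s of effective green, mirroring the green-time optimization rows in the table.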
Table 4. MI-ATSC using Reinforcement Learning.

| Ref. | Approach | Solution Strategy | Scenario | Simulator | Compared with |
|---|---|---|---|---|---|
| [75] | MBRL | Congestion value sharing | 12 RIs | GDL | TC-1 [76] |
| [70] | Bayesian trans. func. | MOL | 12 RIs | GDL | TC-1 [76] |
| [77] | Bayesian trans. func. | MOLAC | 12 mixed intersections | GDL | TC-1 [76] |
| [78] | MBRL with Bayesian trans. func. | MOLAC | 22 RIs | GDL | TC-1 [76] |
| [79] | QL, AC | FA | 2 × 2 grid, 5 RIs | GDL | FTS |
| [59] | QL | FA | 2 × 2 grid, 3 × 3 grid, 5 RIs, 9 RIs | GDL | FTS |
| [80] | MBRL | MOL | Real-world Beijing road map | Paramics | FTS, Ac, single-agent RL |
| [60] | QL | Direct coordination, indirect coordination | RIs in central Toronto | Paramics | FTS, Semi-Ac, Full Ac |
| [81] | SA | FA, MOL threshold, lexicographic ordering | 3 RIs in Stockholm | Sumo | Multiple FA |
| [62] | QL | 2-level hierarchical control | 3 × 3 grid | Aimsun | 1-level QL |
| [63] | AC | Tile coding, RBF | Real road map of metropolitan Tehran | Aimsun | QL, FTS, Ac |
| [64] | Avg. reward | Multi-reward design | 8, 11 RIs | Vissim | QL, FTS, Ac |

FA: Function approximation; FTS: Fixed-time scheduling; Ac: Actuated control
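Several MI-ATSC schemes in Table 4 coordinate by exchanging local measurements between neighboring agents rather than by centralizing control. The sketch below shows the basic pattern on a made-up two-intersection arterial (invented arrival rates and turning behavior): each independent Q-learning agent augments its own queue state with the neighbor's main-road queue bin, so congestion spilling along the shared link becomes visible to both controllers.

```python
import random

class TwoIntersectionArterial:
    """Toy arterial: two intersections sharing a main-road link."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        # queues[i][0]: main-road queue; queues[i][1]: side-road queue
        self.queues = [[0, 0], [0, 0]]
        return self.states()

    def states(self):
        # Each agent sees its own queue bins plus the neighbor's main bin.
        out = []
        for i in (0, 1):
            own = tuple(min(q // 3, 3) for q in self.queues[i])
            nbr = min(self.queues[1 - i][0] // 3, 3)
            out.append(own + (nbr,))
        return out

    def step(self, actions):
        # External arrivals on both roads at each intersection.
        for i in (0, 1):
            if self.rng.random() < 0.4:
                self.queues[i][0] += 1
            if self.rng.random() < 0.25:
                self.queues[i][1] += 1
        # Action 0 serves the main road, 1 the side road; half of the
        # served main-road vehicles travel on to the neighbor.
        for i, a in enumerate(actions):
            served = min(self.queues[i][a], 2)
            self.queues[i][a] -= served
            if a == 0:
                self.queues[1 - i][0] += served // 2
        rewards = [-sum(q) for q in self.queues]
        return self.states(), rewards

def train(env, episodes=150, steps=80, alpha=0.1, gamma=0.9, eps=0.1):
    """Independent Q-learning per agent over neighbor-augmented states."""
    qs = [{}, {}]
    rng = random.Random(1)
    for _ in range(episodes):
        ss = env.reset()
        for _ in range(steps):
            acts = []
            for i in (0, 1):
                if rng.random() < eps:
                    acts.append(rng.randrange(2))
                else:
                    acts.append(max((0, 1),
                                    key=lambda a: qs[i].get((ss[i], a), 0.0)))
            ss2, rs = env.step(acts)
            for i in (0, 1):
                best = max(qs[i].get((ss2[i], a), 0.0) for a in (0, 1))
                key = (ss[i], acts[i])
                old = qs[i].get(key, 0.0)
                qs[i][key] = old + alpha * (rs[i] + gamma * best - old)
            ss = ss2
    return qs
```

Including the neighbor's bin in the state is the simplest form of the "indirect coordination" and "neighborhood information sharing" strategies listed in the table; direct coordination schemes additionally negotiate joint actions.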
Table 5. MI-ATSC using metaheuristic techniques.

| Objectives | MM | CL | GT | Off | PS | SI | MI | FT | RT | Sen | Cam | Sim | Simulator Used | Method | Ref. |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ↑ Flow rate | N | Y | Y | N | N | N | Y | Y | N | N | N | Y | Sumo | PSO | [46] |
| ↓ Avg. delay, ↓ Ql, ↓ ATT, ↓ cost | Y | Y | Y | Y | Y | N | Y | N | Y | N | N | N | Sumo | NSGA-II, LS-NSGA-II, MODELA | [84] |
| ↑ Flow rate, ↓ Avg. delay | N | N | Y | N | Y | N | Y | Y | N | N | N | N | Sumo | PSO | [87] |
| Improve parameters for oversaturated states, queue length at RI | Y | Y | Y | Y | Y | N | Y | Y | N | N | Y | N | Corsim | GA | [82] |
| ↓ ATT | Y | Y | Y | Y | Y | N | Y | Y | N | N | N | N | Sumo | GA | [47] |
| ↓ ATT | N | Y | Y | N | Y | Y | Y | Y | N | N | N | Y | Paramics | SA, GA, CS | [47] |

Column key — MM: mathematical model; CL: cycle length; GT: green time; Off: offsets; PS: phase sequence; SI/MI: single/multiple intersections; FT/RT: fixed-time/real-time control strategy; Sen: sensor/detector; Cam: camera; Sim: simulated/various sources. ↑: Increase; ↓: Decrease; Y: Yes; N: No; Avg: Average; Ql: Queue length; ATT: Average travel time
Table 6. Miscellaneous techniques for ATSC.

| Objectives | MM | CL | GT | Off | PS | SI | MI | FT | RT | Sen | Cam | Sim | Simulator Used for Test | Method | Ref. |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ↓ Avg. delay at RI | Y | N | Y | N | Y | Y | N | N | Y | Y | N | N | Sumo | Statistical multiplexing, PSO | [97] |
| ↓ CWT | Y | Y | Y | Y | Y | N | Y | Y | N | N | N | N | NA | Artificial BC, HS, and Water Cycle Algorithm | [94] |
| ↓ Congestion, ↓ Network interruption, ↓ CWT | N | N | N | N | N | N | N | N | Y | N | N | Y | CityFlow | Nash-A2C, Nash-A3C | [74] |
| ↓ ATT | N | Y | Y | N | N | N | Y | N | Y | N | N | N | Sumo | NeuroEvolution strategy | [98] |
| ↓ Avg. delay at RI | Y | Y | Y | N | Y | Y | N | Y | N | Y | N | N | NA | GA | [99] |
| ↓ ATT, ↓ Avg. delay | N | Y | Y | N | N | N | Y | N | Y | Y | N | N | NA | PSO, three sub-controllers | [100] |

Column key — MM: mathematical model; CL: cycle length; GT: green time; Off: offsets; PS: phase sequence; SI/MI: single/multiple intersections; FT/RT: fixed-time/real-time control strategy; Sen: sensor/detector; Cam: camera; Sim: simulated/various sources. ↓: Decrease; Y: Yes; N: No; Avg: Average; NA: Not available; CWT: Cumulative travel time
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Agrahari, A.; Dhabu, M.M.; Deshpande, P.S.; Tiwari, A.; Baig, M.A.; Sawarkar, A.D. Artificial Intelligence-Based Adaptive Traffic Signal Control System: A Comprehensive Review. Electronics 2024, 13, 3875. https://doi.org/10.3390/electronics13193875

