In this section, the specific details of the edge swarm intelligence ATSC environment (the scene of traffic signal control), the global traffic flow prediction (involved in collaborative judgment), and the CRO-CPSO model (the core of our solver technique) are introduced.
2.2. The Global Traffic Flow Prediction
This subsection describes the pheromones in the global traffic flow prediction, the input and output fuzzy membership of the fuzzy logic system, and finally the update strategy for the states of vehicle units.
We designed a method to aggregate and process edge-distributed road pheromone information that can be used for traffic flow prediction. Short-term real-time vehicle waiting time prediction serves as the collaborative judgement on local traffic signal control strategies and provides support for local coordinated adjustment. In the edge-distributed ATSC environment, the underlying data constitute the inherent data set for a road, such as road structure names, identifications, geographical location and graph topology. The dynamic data set of the road includes real-time vehicle speed and positional information and predictive vehicle routes. The data are transmitted to MEC servers via vehicle units and RSUs and are then further filtered and processed. The MEC servers obtain real-time speed and positional information from vehicle units and obtain inherent road data sets from RSUs. Then, MEC servers predict the short-term real-time vehicle waiting time at the intersection and send the prediction to vehicle units. MEC servers share and coordinate with their neighboring servers via I2I.
Aggregating and generating global traffic flow predictions is a way to directly reflect the real-time state information of vehicle units in highly dynamic road networks. The pheromones used in global traffic flow prediction mainly include VRA (the current road name of the vehicle unit), VST (the time until the vehicle unit switches state), VRN (the number of remaining roads for the vehicle unit) and so on. VRA can be obtained by analyzing the real-time location and speed of a vehicle unit and the road map topology via V2I. When a vehicle unit runs on a road section at normal speed, VRA is the name of this road section. When the real-time speed of the vehicle unit is lower than the normal speed, VRA is “waiting_state”. A vehicle unit is in one of two states: moving or waiting. Accordingly, VST takes two forms: the remaining time from the moving state to the waiting state and the remaining time from the waiting state to the moving state. VST is mainly composed of the experienced travel time, the intersection delay caused by traffic signal control and the delay caused by traffic congestion. VST can be calculated from the real-time vehicle unit speed and the road map topology obtained via V2I and the real-time queue at the intersection obtained via I2I. VRN is the prediction of the remaining path of the vehicle unit, which can be obtained via V2I.
The first step is to calculate the experienced travel time. The speed information of vehicle units on the road network is updated frequently; thus, we calculate the real-time spatial average speed and use the real-time spatial average speed to calculate the experienced travel time. The real-time spatial average speed is the average speed of k samples within a time interval t, which can be calculated by Equation (1). Then, we divide the length of the road section travelled in time interval t by the real-time spatial average speed to obtain the experienced travel time, as shown in Equation (2).
where
is the instantaneous speed of the mth sample of vehicle unit
and
is the length of the road section travelled by vehicle unit
in time interval t.
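The two-step calculation described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names and the units (metres, metres per second) are assumptions, and the real-time spatial average speed is taken to be the arithmetic mean of the k samples, as the text describes.

```python
from statistics import fmean


def spatial_average_speed(samples_mps):
    """Mean of the k instantaneous speed samples taken within the interval."""
    if not samples_mps:
        raise ValueError("need at least one speed sample")
    return fmean(samples_mps)


def experienced_travel_time(section_length_m, samples_mps):
    """Length travelled divided by the real-time spatial average speed."""
    avg = spatial_average_speed(samples_mps)
    if avg <= 0:
        raise ValueError("average speed must be positive")
    return section_length_m / avg


# Hypothetical samples: four speed readings over a 200 m stretch.
travel_time_s = experienced_travel_time(200.0, [10.0, 12.0, 8.0, 10.0])
```

With these hypothetical samples the average speed is 10 m/s, so the experienced travel time is 20 s.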
Then, we calculate the delay caused by traffic congestion. The traffic flow is a continuous cycle, and the subjective factors of drivers are uncontrollable. Any slight change in the traffic cycle may result in a mismatch between the calculated and optimal cycle. Thus, we use fuzzy logic [33] in each successive cycle to overcome this mismatch. We construct the fuzzy system for the waiting time of vehicle units by establishing relations between the inputs and outputs of the fuzzy system using if–then rules. The proposed fuzzy logic system consists of four inputs: VQL (the queue length of vehicle units), IWJ (the judgement of whether to wait at the intersection), GPD (the duration of the green phase) and VP (the priority of vehicle units).
The VQL is the remaining queue length of vehicle units in the traffic flow. It contains three membership functions named Zero, Short and Long that range from 0 to 25 vehicle units, as illustrated in Figure 2a.
The IWJ is the judgement of whether a vehicle unit is waiting at the intersection. It contains three membership functions named Ahead Oftime, Ordinary and Delayed that range from 0% to 100%, as illustrated in Figure 2b.
The GPD is the duration of the green phase of the traffic light. It contains three membership functions named Short, Medium and Long that range from 0 to 60 s, as illustrated in Figure 2c.
The VP is the priority of vehicle units on the road network. It contains three membership functions named Optimal, Suboptimal and Ordinary that range from 0% to 100%, as illustrated in Figure 2d.
The output of the fuzzy logic system, WT (the waiting time of a vehicle unit), is used to identify the delay caused by traffic congestion. It contains five membership functions called RU (Road Unobstructed), LC (Light Congestion), MC (Moderate Congestion), SC (Severe Congestion) and TU (Traffic Tie-Up) that range from 0 to 100 s, as illustrated in Figure 3. The proposed fuzzy logic system comprises forty-five fuzzy rules. Some of these rules are shown in Table 1.
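To make the fuzzification step concrete, the sketch below evaluates triangular membership functions for two of the inputs and the firing strength of a single if–then rule. Everything numeric here is an assumption: the membership shapes and breakpoints are not given in the text (they are defined by Figure 2), the example rule is hypothetical rather than taken from Table 1, and the minimum operator is only one common way to combine antecedents.

```python
def triangular(x, left, peak, right):
    """Triangular membership degree of x, with left <= peak <= right."""
    if x < left or x > right:
        return 0.0
    if x == peak:
        return 1.0
    if x < peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Assumed breakpoints (left, peak, right); the real ones come from Figure 2.
VQL_SETS = {"Zero": (0, 0, 8), "Short": (4, 12, 20), "Long": (15, 25, 25)}
GPD_SETS = {"Short": (0, 0, 25), "Medium": (15, 30, 45), "Long": (35, 60, 60)}


def fuzzify(value, sets):
    """Map a crisp input to its membership degree in every fuzzy set."""
    return {name: triangular(value, *abc) for name, abc in sets.items()}


# Hypothetical rule: IF VQL is Long AND GPD is Short THEN WT is SC.
vql = fuzzify(20, VQL_SETS)
gpd = fuzzify(10, GPD_SETS)
rule_strength = min(vql["Long"], gpd["Short"])
```

Under these assumed breakpoints, a queue of 20 vehicle units is Long to degree 0.5 and a 10 s green phase is Short to degree 0.6, so the hypothetical rule fires with strength 0.5.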
The state of the vehicle unit needs to be updated constantly to ensure a real-time and accurate representation of the current traffic status. The update strategy for the state of vehicle units is shown in Algorithm 1. As time advances, VST decreases synchronously. When VST is non-zero, the vehicle unit maintains its current state. When VST is zero, we need to judge whether the intersection of the current road section is the destination of the vehicle unit. When the vehicle unit has not reached its final destination, it will continue to drive, so it is necessary to analyze the change in its state. If the vehicle unit is in “waiting_state”, it ends the “waiting_state” and continues to drive on the next road section. VST is reset and the experienced travel time for the next road section and the intersection delay caused by traffic signal control are added, as shown in Equation (3). If the vehicle unit was previously in the moving state, it has arrived at the intersection, and we need to judge whether it needs to wait for the traffic light. If it does not need to wait, it passes through the intersection directly and continues driving on the next road section; VST is reset and calculated, as shown in Equation (3). If it needs to wait for the traffic light, its VRA becomes “waiting_state”; VST is reset and WT is added, as shown in Equation (4). When the vehicle unit arrives at its destination, we no longer predict its real-time state, thus saving computing resources.
where
is the intersection delay caused by traffic signal control.
Algorithm 1 The update strategy of the state of vehicle unit |
1: | if && && then |
2: | Analysis of vehicle unit queuing: |
3: | if then |
4: | Update VRA to name of next road section; |
5: | Reset VST; |
6: | ; |
7: | ; |
8: | else if then |
9: | ; |
10: | Reset VST; |
11: | ; |
12: | end if |
13: | else if && then |
14: | Update VRA to the name of next road section; Reset VST; |
15: | ; |
16: | ; |
17: | else if && then |
18: | ; |
19: | Remove the vehicle unit from the predicting cycle; |
20: | end if |
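The update strategy can be read as a small state machine. The sketch below is reconstructed from the prose description only, since the conditions in the listing above did not survive extraction. The class, the field names and the four callbacks (`travel_time`, `signal_delay`, `must_wait`, `waiting_time`) are hypothetical, and the number of remaining roads (VRN) is represented by the length of the remaining route.

```python
from dataclasses import dataclass, field

WAITING = "waiting_state"


@dataclass
class VehicleUnit:
    vra: str                                   # current road name, or WAITING
    vst: float                                 # seconds until the next state switch
    route: list = field(default_factory=list)  # remaining road sections, next first
    active: bool = True                        # False once removed from prediction


def update_vehicle(v, dt, travel_time, signal_delay, must_wait, waiting_time):
    """Advance one vehicle unit by dt seconds (all callbacks are assumptions)."""
    if not v.active:
        return v
    v.vst = max(0.0, v.vst - dt)
    if v.vst > 0:
        return v                               # keep the current state
    if not v.route:
        v.active = False                       # destination reached: stop predicting
        return v
    if v.vra != WAITING and must_wait(v):
        v.vra = WAITING                        # arrived at a red light
        v.vst = waiting_time(v)                # reset VST and add WT, cf. Equation (4)
        return v
    # Either the wait has ended or the vehicle passes straight through.
    nxt = v.route.pop(0)
    v.vra = nxt
    v.vst = travel_time(nxt) + signal_delay(nxt)  # cf. Equation (3)
    return v
```

A vehicle unit that must wait therefore goes through two zero crossings of VST at the same intersection: the first switches it to “waiting_state”, and the second moves it onto the next road section.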
2.3. CRO-CPSO Model
This subsection describes the fitness function, the co-factor set, the solution encoding and finally the optimization procedure for our proposed CRO-CPSO algorithm.
Research [29,30,31,32] on traffic signal control optimization has shown that swarm intelligence algorithms can outperform traditional methods in many cases. However, swarm intelligence algorithms have slow convergence speeds when dealing with multi-constraint optimization problems and cannot be well adapted to the current problem of large-scale and complex road networks. Thus, this paper proposes a distributed adaptive cooperative chemical reaction–cooperative particle swarm optimization (CRO-CPSO) algorithm. CRO-CPSO generates and iterates the local strategy at the edge via a distributed structure. It avoids the exponential growth of the solution space when confronted with large-scale road networks. The distributed structure also promotes cooperative control among traffic signal lights in the surrounding area. We use energy exchange as an indicator scheme to achieve an adaptive combination of local search and global exploration in the solution space. We also use a co-factor set to realize cooperative and coordinated control actions among adjacent edges and offline learning with historical traffic flow data.
When a swarm intelligence algorithm is used to solve traffic signal control optimization, the scale of the solution space increases exponentially with the continuous expansion of the road network and the increasing complexity of intersections. With the traditional centralized method, even a powerful centralized cloud computing server incurs a high computing cost and overhead, so the method cannot adapt to large-scale road networks. With the development of edge computing, the computing resources deployed on the edge have broad application prospects. In our proposed CRO-CPSO, the local strategy for each intersection is generated using the computing resources of edge servers deployed at the intersections. This realizes the parallel utilization of distributed computing resources and reduces the computing load of the cloud server. The co-factor set is maintained on the adjacent edge servers to realize offline learning of historical traffic data and cooperative control of adjacent traffic signals.
The traffic signal control optimization problem in this paper is a typical objective optimization problem. The solution space of the signal control set is divided into two parts: and , where is the green phase sequence of each traffic light at the intersection, is the duration of the green phase of the traffic light at the intersection and is the number of traffic lights at the intersection.
Considering information on various events during the simulation, the fitness function of CRO-CPSO is shown as Equation (5). Equations (8)–(10) are the constraints on it. The main objective of Equation (5) is to enhance the driver’s driving experience and reduce driving time. We achieve the incentive effect by increasing the number of vehicles that arrive at the destination within the reward time
and giving the reward score
. We set the decision factor for the reward score (
) so that the vehicle that runs out of time will not receive
. Equation (8) ensures that each vehicle unit passes through any given intersection at most once. Equation (9) ensures that, among the vehicle units waiting at the intersection, those with higher priority pass first. Equation (10) ensures that only one traffic light is green at each intersection at any given time.
where
is the reward score,
is the maximum time to ensure the drivers’ driving experience,
is the set of all vehicle units,
is the set of road sections of vehicle unit
and
is the set of waiting states of vehicle unit
.
where
is the number of times vehicle unit
arrives at intersection
,
is the set of all intersections,
is the road section starting from intersection
and ending at intersection
, and
is the set of all road sections.
where
is the queuing sequence of vehicle unit
waiting for the green light in road section
when it arrives at intersection
and
is the set
VP of vehicle units arriving at intersection
in road section
in time interval t.
where
is the identifier of the state of the green traffic light in road section
and
is the set of all intersections leading to intersection
.
In this paper, we use a co-factor set to realize the cooperative and coordinated control actions among adjacent edges and offline learning with historical traffic flow data. The co-factor set objectively reflects the congestion of each road section and the potential “Attacking Traffic Flow” (after passing the intersection, the vehicle units will enter the next road section, thus increasing the congestion level and attacking the traffic density of the road section) among road sections. As the experienced set of global road sections, the co-factor set reflects both the past traffic flow information of each intersection and the associated influence among road sections. Thus, we can deepen the degree of cooperation among edge servers in the global control of traffic lights by introducing the co-factor set. It enables edge servers to consider the cooperation among servers in generating local strategies. The joint reward feedback for each edge server will further iterate the co-factor set, so as to take experience and timeliness into account.
As the first step, we calculate the joint reward feedback for each edge server. We take the average waiting time of vehicle units within the coverage area of edge servers as the evaluation index.
is the average waiting time of vehicle units within the coverage area of edge server
in time interval t, as shown in Equation (11).
is the local reward feedback for edge server
, as shown in Equation (12). The adjacent edge servers have traffic flow correlation. Thus, a spatial attenuation factor is introduced so that the joint reward feedback can reflect the gains of the adjacent environment.
is the joint reward feedback for edge server
, as shown in Equation (13).
is the traffic density within the coverage area of edge server
in time interval t, as shown in Equation (14).
where
is the set of all vehicle units within the coverage area of edge server
,
is the set
WT of the vehicle units within the coverage area of edge server
and
is the set of all the edge servers.
where
is the spatial attenuation factor and
is the set of the adjacent edge servers of edge server
. Because the spatial attenuation is fast, the set of adjacent edge servers only considers a two-layer road network structure.
where
is the set of all the road sections within the coverage area of edge server
,
is the number of vehicle units on the road section
within the coverage area of edge server
in time interval t and
is the length of the road section
within the coverage area of edge server
.
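Two of the quantities defined above, the average waiting time and the traffic density within the coverage area of an edge server, can be sketched directly from their verbal definitions. The function names and data layout are assumptions, and the density is taken here as total vehicle units over total road length, which is one plausible reading of Equation (14) and not necessarily the exact form used. The local and joint reward feedback are omitted because their formulas are not recoverable from the text.

```python
def average_waiting_time(waiting_times_s):
    """Mean WT over the vehicle units covered by one edge server."""
    if not waiting_times_s:
        return 0.0
    return sum(waiting_times_s) / len(waiting_times_s)


def traffic_density(vehicle_counts, section_lengths_m):
    """Vehicle units per metre over all road sections of one edge server.

    Assumed form: total vehicle count divided by total section length.
    """
    total_length = sum(section_lengths_m.values())
    if total_length <= 0:
        raise ValueError("total road length must be positive")
    total_vehicles = sum(vehicle_counts.get(r, 0) for r in section_lengths_m)
    return total_vehicles / total_length


# Hypothetical coverage area with two road sections.
awt = average_waiting_time([10.0, 20.0, 30.0])
density = traffic_density({"r1": 6, "r2": 4}, {"r1": 300.0, "r2": 200.0})
```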
Then we construct the co-factor set
as a matrix of
as shown in Equation (15). The co-factor set is acquired through offline learning and iterative updating with immediate joint reward feedback, as shown in Equation (16).
where
is the benchmark distance,
is the distance between edge server
and edge server
, and
and
are the attenuation factors.
The swarm intelligence algorithm can be used to solve the multi-objective optimization problem of traffic signal control. In the traditional swarm intelligence algorithm, each iteration of the solution space applies a local search algorithm, which requires a lot of computing resources and reduces the convergence efficiency. Edge servers only have limited computing resources and storage capacities. Thus, we adopt the optimization framework of the chemical reaction optimization algorithm [34] to accelerate the convergence of traffic signal control optimization strategies. We adopt the PSO [35] algorithm with excellent inter-individual coordination and introduce the co-factor set under the CRO framework to realize local cooperative scheduling and global control among traffic lights in the multi-dimensional control problem of large-scale traffic light control. By configuring two necessary attributes, molecular potential energy (PE) and molecular kinetic energy (KE), the algorithm can avoid falling into local optima too early and converge to optima faster. PE represents the stability of the solution space and is defined as the reciprocal of vehicle travel time, as shown in Equation (17). When PE increases, the solution space tends to be stable. Thus, the global exploration will be stopped and local searching will be performed. KE makes the solution space tend to be dynamic. When KE is high, the global exploration will be continued to avoid falling into local optima too early. Therefore, local searching is carried out only when KE attenuates to a threshold value and PE tends to be stable, which greatly improves the convergence efficiency. Thus, the algorithm can be deployed on edge servers with limited computing resources.
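The switching rule between global exploration and local search can be illustrated with a short sketch. The stability test used here (the spread of the last few PE values falling within a tolerance) and all parameter names are assumptions, since the text states only that KE must attenuate to a threshold and PE must tend to be stable.

```python
def choose_search_mode(ke, pe_history, ke_threshold, pe_tolerance, window=3):
    """Return "local" only when KE has decayed and PE has stabilized.

    pe_history holds the PE values of recent iterations, newest last.
    """
    if ke > ke_threshold or len(pe_history) < window:
        return "global"
    recent = pe_history[-window:]
    pe_is_stable = max(recent) - min(recent) <= pe_tolerance
    return "local" if pe_is_stable else "global"
```

Gating the local search in this way means that its cost is paid only near convergence, which matches the stated motivation of running on edge servers with limited computing resources.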
The pseudocode of CRO-CPSO is shown in Algorithm 2. The input of Algorithm 2 is road pheromone information and the global co-factor set; its output is the local traffic light regulation strategy. The solution space
is the green phase sequence of each traffic light at the intersection, where
is an array of different integers ranging from 1 to
. In this paper, we use
and
to run the global exploration of the solution space. When the solution space tends to be stable, we use
and
to run the local search in the solution space.
Figure 4 shows the updating scheme for the solution space
of dimension 6.
Algorithm 2 Pseudocode of CRO-CPSO |
1: | InitialSwarm() |
2: | |
3: | while do |
4: | for each particle do |
5: | if then |
6: | |
7: | else if then |
8: | |
9: | else if then |
10: | |
11: | else if then |
12: | |
13: | end if |
14: | |
15: | |
16: | |
17: | Update |
18: | Update |
19: | end for |
20: | end while |
When the solution space satisfies the condition of
Self-Collision,
will occur. As shown in
Figure 4a, the solution space is slightly changed. We use the method of Swapping Two-Domain Spaces to ensure that the spatial arrangement of S’ is not repetitive. After the Collision, the KE of S’ decays and the PE of S’ is updated. When the solution space satisfies the condition of
Self-Decomposition,
will occur. As shown in
Figure 4b, the solution space is changed considerably. Select a breakpoint for S and divide it into two parts
. S1 retains
and S2 retains
. Then, the remaining parts of S1 and S2 are generated while ensuring that the arrangement of solution spaces is not repetitive. The KE between the solution spaces is redistributed and the PE of the solution spaces is updated.
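The two single-solution operators can be sketched on a permutation of phase indices. This is an illustration under stated assumptions rather than the exact operators of Figure 4: the function names are hypothetical, S1 is assumed to retain the part before the breakpoint and S2 the part after it, and the remaining positions are refilled with the unused values in random order. The KE and PE bookkeeping is omitted.

```python
import random


def self_collision(s, rng):
    """Swap two distinct positions: a small change that keeps s a permutation."""
    i, j = rng.sample(range(len(s)), 2)
    out = list(s)
    out[i], out[j] = out[j], out[i]
    return out


def self_decomposition(s, breakpoint, rng):
    """Split s at breakpoint into two new permutations (a large change).

    s1 keeps s[:breakpoint] and s2 keeps s[breakpoint:]; the other positions
    are filled with the remaining values, shuffled.
    """
    head, tail = list(s[:breakpoint]), list(s[breakpoint:])
    rest_for_s1 = [v for v in s if v not in head]
    rest_for_s2 = [v for v in s if v not in tail]
    rng.shuffle(rest_for_s1)
    rng.shuffle(rest_for_s2)
    return head + rest_for_s1, rest_for_s2 + tail


rng = random.Random(0)          # seeded only to make the example repeatable
s = [1, 2, 3, 4, 5, 6]
s_prime = self_collision(s, rng)
s1, s2 = self_decomposition(s, 3, rng)
```

Both operators return valid permutations by construction, so no repair step is needed for them.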
When the solution space satisfies the condition of
Intergroup-Collision,
will occur. As shown in
Figure 4c, the solution space is slightly changed. In the
, we use
Conflict-Detection to map the conflicting values so as to ensure the non-repeatability of the arrangement between S1′ and S2′. In the example of
Conflict-Detection in
Figure 4c, we can see two sets of mappings
and
. Thus, if there are two
s in the solution space after the
,
will be converted to
, and so on, until there is no conflict. After the Collision, the KE between the solution spaces is redistributed and the PE of the solution spaces is updated. When the solution space satisfies the condition of
Intergroup-Synthesis,
will occur. As shown in
Figure 4d, the solution space is changed considerably. Choose a synthesis point for the two solution spaces. Then, the
of S1 and the
of S2 are combined to produce a new solution space S with great diversity. In the
, we also use
Conflict-Detection to map the conflicting values so as to ensure the non-repeatability of the arrangement of S. After the synthesis, the KE of the solution spaces is aggregated, and the PE of the solution space is updated.
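The two-solution operators and the Conflict-Detection repair can be sketched in the style of a partially mapped crossover, which behaves as the text describes: exchanged values define a mapping, and a duplicated value is replaced by following that mapping until no conflict remains. The function names, the choice of exchanged segment, and the assumption that synthesis joins the front of S1 to the back of S2 are all illustrative. The KE and PE bookkeeping is again omitted.

```python
def _repair(child, lo, hi, mapping):
    """Replace duplicates outside [lo, hi) by following the mapping."""
    fixed = set(child[lo:hi])
    for idx in list(range(lo)) + list(range(hi, len(child))):
        value = child[idx]
        while value in fixed:          # conflict: value already in fixed part
            value = mapping[value]
        child[idx] = value
    return child


def intergroup_collision(s1, s2, lo, hi):
    """Exchange the segment [lo, hi) between two permutations, then repair."""
    c1, c2 = list(s1), list(s2)
    c1[lo:hi], c2[lo:hi] = s2[lo:hi], s1[lo:hi]
    map1 = dict(zip(s2[lo:hi], s1[lo:hi]))   # incoming value -> displaced value
    map2 = dict(zip(s1[lo:hi], s2[lo:hi]))
    return _repair(c1, lo, hi, map1), _repair(c2, lo, hi, map2)


def intergroup_synthesis(s1, s2, point):
    """Join the front of s1 to the back of s2, then repair the back part."""
    child = list(s2)
    child[:point] = s1[:point]
    mapping = dict(zip(s1[:point], s2[:point]))
    return _repair(child, 0, point, mapping)


a = [1, 2, 3, 4, 5, 6]
b = [3, 5, 1, 6, 2, 4]
c1, c2 = intergroup_collision(a, b, 2, 4)
merged = intergroup_synthesis(a, b, 3)
```

For these hypothetical inputs, the collision yields `[3, 2, 1, 6, 5, 4]` and `[1, 5, 3, 4, 2, 6]`, and the synthesis yields `[1, 2, 3, 6, 5, 4]`; each result is a valid permutation of 1 to 6.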
The solution space is the duration of the green phase of a traffic light at an intersection, where is an array of different integers ranging from 0 to 60 s. When edge server i generates the duration of the green phase of the traffic light, we introduce co-factor set into the solution space. We focus on the k-neighbor road sections that are directly related to the current intersection in updating the solution space. The co-factor reflects the potential “Attacking Traffic Flow” (including the potential traffic flow of other road sections extending from intersection ) at intersection in the direction of road section . Thus, when accounts for a large proportion of , it indicates that the road section may become congested. Thus, experience-based coordinated regulation can be implemented to alleviate the potential congestion of road sections. We need to extend the green phase of road section in an appropriate proportion within the green cycle. The green phases of other road sections with smaller proportions in are appropriately compressed. The ratio of extension to compression is determined by the weight ratio. The extension–compression ratio is also affected by the experiential factor due to the empirical lag of the co-factor set, as shown in Equation (18).
We introduce the idea of particle swarm optimization in the iteration of
. Each potential solution to the problem is the position of the particle, and the particles are updated iteratively on a population scale. The particles are initialized before the iteration begins. The fitness of the particles is calculated using Equation (5) as the initial
and
of the particles. Then, the circular heuristic search process is initiated:
and
(
for the particle is rounded in the updating process) for the particle are updated iteratively, and its fitness is calculated. If the fitness is better than
, update
. If the fitness is better than
, update
. In each iteration update, the particle updates its position according to Equation (18):
where
is the experiential factor,
and
represent rounding processing and
is the velocity of the particle, as shown in Equation (19):
where
is the individual extremum,
is the global extremum,
and
are learning factors,
is a uniform random value in [0, 1] and
is the inertia weight, as shown in Equation (20):
where
is the maximum of the inertia weight,
is the minimum of the inertia weight and
is the maximum number of iterations.
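One iteration of the particle update can be sketched as follows. The formulas used are the standard PSO velocity update and the common linearly decreasing inertia weight, which agree with the symbol definitions above but are assumptions about the exact form of Equations (19) and (20). The experiential factor of Equation (18) is not modelled, two independent random draws are used per dimension, and the default parameter values are illustrative only.

```python
import random


def inertia_weight(iteration, max_iter, w_max=0.9, w_min=0.4):
    """Inertia weight decreasing linearly from w_max to w_min."""
    return w_max - (w_max - w_min) * iteration / max_iter


def pso_step(x, v, pbest, gbest, w, c1=2.0, c2=2.0, rng=random, lo=0, hi=60):
    """Update one particle; positions are rounded and clipped to [lo, hi] s."""
    new_x, new_v = [], []
    for xi, vi, pi, gi in zip(x, v, pbest, gbest):
        r1, r2 = rng.random(), rng.random()
        vel = w * vi + c1 * r1 * (pi - xi) + c2 * r2 * (gi - xi)
        pos = min(hi, max(lo, round(xi + vel)))   # integer green duration
        new_v.append(vel)
        new_x.append(pos)
    return new_x, new_v


# Hypothetical particle: green durations for three traffic lights.
w = inertia_weight(iteration=10, max_iter=100)
x, v = pso_step([20, 35, 50], [0.0, 0.0, 0.0], [25, 30, 50], [30, 30, 45],
                w, rng=random.Random(1))
```

A large inertia weight early on favours exploration, and its decay towards the minimum lets the particles settle around the individual and global extrema in later iterations.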