1. Introduction
The demands and expectations of transportation infrastructure users and the complexity of traffic regulation and control in modern cities are driving the need to include novel, advanced solutions in traffic flow optimisation and management [1,2,3]. All urban traffic optimisation and management depends on the feedback signal from sensors. While video-surveillance systems coupled with autonomous artificial intelligence (AI)-driven decision algorithms are being actively pursued [4,5], various solutions based on different sensors [6] are commonly applied for different categories of traffic participants.
Common examples, albeit for vehicles, are induction loop systems [7], which detect the disturbance of the loop’s own magnetic field caused by the vehicle’s metallic construction. Induction loop systems generally require a lengthy and complicated installation procedure, as the pavement must be cut [6,7]. For pedestrians, who are the focus of this paper, the technology seeing increasingly widespread use is the already mentioned video-surveillance traffic system. Such systems allow simultaneous and accurate monitoring of several traffic areas used by different traffic participants, and can also be quickly and accurately modified [6]. In general, this type of system is very cost-effective, especially in highly specialised cases, such as distinguishing between different traffic participants.
Despite the relatively simple installation, video-surveillance traffic systems still require frequent maintenance and lens cleaning. However, the main shortcoming of these systems is unreliable operation in low-visibility conditions, and novel approaches are being developed to handle such problems automatically [8]. Unreliable operation mainly occurs at night and in low-visibility weather conditions such as fog, rain and snow. Some video-surveillance traffic systems are even susceptible to incorrectly recognising shadows as traffic participants [6,9]. Furthermore, using cameras in public spaces raises the question of privacy, primarily when these systems are used to monitor pedestrians [10,11]. To mitigate the privacy concerns caused by video surveillance, various techniques for privacy preservation have been developed [12,13,14], which is less than ideal because of the additional post-processing.
Radars share many advantages with video-surveillance traffic systems. They are just as capable of recognising various traffic participants and require similar installation procedures [7]. Unlike cameras, however, radars are simpler to maintain; are not sensitive to reduced-visibility conditions; cannot invade privacy by design; and can very accurately determine the position, speed and direction of traffic participants within the field of view [15]. Different radars are already being used in ground traffic control, with the most common type being the continuous-wave (CW) Doppler radar, which is generally only used for collecting speed data [6]. For other purposes, such as measuring range or counting traffic volume, the CW Doppler radar is not accurate enough and is not a suitable choice, as its signal lacks a timing mark [16]. Alongside the Doppler-type radar, the second type of radar often used in ground traffic control is the frequency-modulated continuous-wave (FMCW) radar, mainly used as a presence detector [7].
Contribution of This Paper
In this paper, we demonstrate a proof-of-concept FMCW radar as an advanced form of pedestrian traffic light triggering mechanism. The proposed system uses a low-cost off-the-shelf FMCW radar as a kerbside detector which enables adaptive pedestrian crossing solutions with the support of multiple object tracking techniques.
Figure 1 shows the suggested placement of a crosswalk radar with highlighted areas of interest, where pedestrians are detected. Our proposed solution relies on the group tracking GTRACK algorithm [17], which is used for multiple object tracking. The algorithm was modified to run on an external processor and can be further modified to work universally with similar FMCW radars. Additionally, we prepared a visualisation tool that shows tracked pedestrians in real time.
The first part of this paper gives a brief overview of existing technological solutions for human or pedestrian observation, detection and tracking. It also examines the shortcomings of these solutions and which kind of radar technology would be the best alternative to them. This is followed by a short explanation of how FMCW radars work, an overview of the radar that we used and an explanation of how pedestrians are detected and tracked from radar measurements. The experimental process for evaluating the system’s performance, along with the experimental results, is described in the third part of this paper, which is followed by the discussion and conclusions.
4. Pedestrian Detection and Tracking with GTRACK
The detection and tracking of pedestrians are performed by the group tracking algorithm GTRACK [17]. GTRACK was initially developed by Texas Instruments for use with their line of mmWave sensors. Since the algorithm was designed to run on the sensors’ integrated processor, we modified it into a Python module so that it can run on any external CPU capable of running Python. Off-loading the detection and tracking from the integrated processor allows the sensor to process more reflection points while still meeting strict real-time requirements. GTRACK was modified so that it can be used alongside the out-of-the-box firmware that comes pre-flashed on off-the-shelf mmWave FMCW radars. Since GTRACK uses point cloud data as its main input, it can be further modified to work with different sensors, not necessarily FMCW radars. This especially benefits sensors that have limited processing capabilities and can only output measurements. Nevertheless, ideally, the same device would perform measurement acquisition, detection, and tracking.
GTRACK takes the point cloud data as the input, which is then processed in several steps: spatial filtering in the form of clustering and temporal filtering in the form of tracking. The first step is the prediction step, in which the algorithm estimates the present position of each currently tracked object at time instance n. This estimate is based on the centroid position of the object’s cluster at the previously known position in time instance n − 1.
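As a rough illustration of the prediction step, the sketch below propagates a track’s centroid from the previous time instance to the current one. It assumes a constant-velocity motion model in two dimensions, which is a simplification chosen here for clarity; the `Track` class, its fields and the `predict` function are hypothetical names and do not reproduce the actual GTRACK implementation.

```python
from dataclasses import dataclass


@dataclass
class Track:
    """Hypothetical minimal track state: centroid position and velocity in 2D."""
    x: float   # position along the sidewalk, m
    y: float   # position away from the radar, m
    vx: float  # velocity along x, m/s
    vy: float  # velocity along y, m/s


def predict(track: Track, dt: float) -> Track:
    """Predict the centroid at the current frame from the previous state.

    Assumption: the target kept a constant velocity during the dt seconds
    that elapsed since the previous frame.
    """
    return Track(
        x=track.x + track.vx * dt,
        y=track.y + track.vy * dt,
        vx=track.vx,
        vy=track.vy,
    )


# A pedestrian at x = 0.5 m walking along the sidewalk at 1 m/s towards -x.
previous = Track(x=0.5, y=2.5, vx=-1.0, vy=0.0)
predicted = predict(previous, dt=0.5)
```

The predicted centroid is what the gate of the following association step would be centred on.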
Next are the association and allocation steps, in which clusters in the point cloud data are either associated with the track of one of the currently tracked objects or, in the case of a newly detected object, allocated a new unique track. In the association step, a gate is formed around each predicted centroid. Measurements within the gate are then associated with the nearest existing track.
If any measurements remain unassociated, new tracks are created for the clusters of measurements that remained after the association step. This process is similar to DBSCAN clustering [17] but is only performed for unassociated measurements. Measurements are clustered together in the order of closest velocity, then closest distance. A new tracking object is initialized if a cluster contains enough measurement points with a strong enough combined signal-to-noise ratio (SNR). The described process is shown in Figure 10.
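The association and allocation logic described above can be sketched as follows. This is a simplified illustration, not the GTRACK code: it assumes a circular gate with a fixed radius and Euclidean distance, and the function names, the gate radius and the allocation thresholds are all hypothetical values picked for the example.

```python
import math


def associate(points, centroids, gate_radius):
    """Assign each measurement to the nearest predicted centroid inside its gate.

    points      -- list of (x, y) measurements in metres
    centroids   -- dict {track_id: (x, y)} of predicted centroids
    gate_radius -- assumed circular gate radius in metres
    Returns (assignments, unassociated): assignments maps a point index to a
    track id; unassociated lists the point indices left for allocation.
    """
    assignments, unassociated = {}, []
    for i, (px, py) in enumerate(points):
        best_id, best_dist = None, gate_radius
        for track_id, (cx, cy) in centroids.items():
            dist = math.hypot(px - cx, py - cy)
            if dist <= best_dist:
                best_id, best_dist = track_id, dist
        if best_id is None:
            unassociated.append(i)
        else:
            assignments[i] = best_id
    return assignments, unassociated


def should_allocate(cluster_snrs, min_points, min_total_snr):
    """Allocation rule: enough points with a strong enough combined SNR.

    Assumption: the combined SNR is taken as the plain sum of per-point SNRs.
    """
    return len(cluster_snrs) >= min_points and sum(cluster_snrs) >= min_total_snr


points = [(0.1, 2.0), (0.2, 2.1), (1.4, 3.0), (3.0, 3.0)]
centroids = {7: (0.0, 2.0), 8: (1.5, 3.0)}
assignments, unassociated = associate(points, centroids, gate_radius=0.5)
```

In this example the first two points fall into the gate of track 7, the third into the gate of track 8, and the last point stays unassociated and would be considered for a new track only if its cluster passed `should_allocate`.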
For each different kind of object we want to track, we must initialise a separate GTRACK instance. Each instance contains the general description of the object type, e.g., pedestrian, cyclist, car, or any other traffic participant. For pedestrians, we initialise a GTRACK instance with the parameters described in Table 2, Table 3 and Table 4. The parameters were determined empirically by scaling the typical human dimensions and space requirements in Table 5. For the depth limit and width limit, we took the space requirement for a person with an open umbrella [55]; this also covers the space requirements for a person with walking crutches [56]. We set both limits to the same value because an umbrella is round. We also observed that the scatter of measurement points of a person without an umbrella was almost always cylindrical, irrespective of the person’s orientation relative to the sensor. For the height limit, we considered the average height of an adult male (1.87 m) [57], which we empirically scaled to 2 m. This height is also closer to the 1.92 m listed in [56]. The latter also lists the shoulder width of 99% of adult males at 0.52 m and abdomen width at 0.35 m.
Since a new instance of the GTRACK algorithm has to be run for each different type of object, running it on an external processor is additionally beneficial. An external processor can handle multiple concurrent instances more efficiently than a sensor’s integrated processor, as the latter also has to manage measurement acquisition.
5. Experimental Evaluation
We designed an experiment with six different scenarios to evaluate pedestrian traffic light triggering. Each scenario was repeated 50 times. In the experiment, we assumed that passing pedestrians would not remain in the radar’s observation area and would exit this area quickly. Similarly, we assumed that pedestrians intending to cross the street would remain inside the observation area until they were given a green signal. Thus, in our prototype, the control of the traffic light was based on the time a person remained in the observation area. If a pedestrian remained within the area for a set amount of waiting time, the system would recognize this and act as if a pedestrian call button had been pressed. We determined the waiting time before triggering a traffic light change empirically and set it to 10 s.
In the first scenario, participants entered and stood inside the observation area that represented the part of the sidewalk where pedestrians would wait for a green signal to cross the road. In each repetition, only one participant entered and was present in the observation area at a time. In this scenario, we observed how many times the system correctly recognized a waiting pedestrian and triggered the change in traffic signalization. If the system triggered a traffic signalization change, we counted that it responded correctly. If the system did not trigger a traffic signalization change, we counted it as an incorrect response.
In the second scenario, participants only passed by the observation area to check whether the system would correctly recognize that none of the detected pedestrians intended to cross the street and, therefore, would not trigger any change in traffic signalization. Again, in this scenario, only one participant was present in the observation area at a time. If a green signal was given despite none of the participants stopping to cross the street, it would only disrupt traffic flow in a real-life scenario, so we counted this as an incorrect system response.
We also want to track and identify multiple pedestrians since multiple pedestrians may be concurrently present within the observable area. However, only once in a while do some of them stop to wait for a green signal. The latter was tested in the third scenario, in which two or more participants entered the observation area in quick succession, so there were always two or three participants present in the observation area at a time. Some participants left the observation area, and some remained inside the area. The participants only passing by the observation area should not confuse the system, which should still trigger traffic signalization changes for standing pedestrians. If the system triggered a traffic signalization change, we counted that it responded correctly. If the system did not trigger a change in traffic signalization while a participant was waiting for a green signal, we counted this as an incorrect response.
In the fourth scenario, two or three participants entered the observation area and immediately left it as if they had only passed by the street crossing. This behaviour should not cause the system to falsely trigger traffic signalization changes, as none of the participants in this scenario stopped inside the observation area.
In the fifth and sixth scenarios, we repeated the first and second scenarios with one person under an open umbrella entering the observation area, to check whether the system still correctly recognized them and responded to their intent, either to cross the road or to pass by.
5.1. Experimental Setup
For the experiment, we attached the radar on a vertical pole and set it at the height of
m with an elevation tilt
°. We arbitrarily set the observation area to be 1.5 m in length, 1.5 m wide and 2 m shifted away from the radar. We chose these measures to approximate an area in which pedestrians would stand to wait for the change in traffic signalization. The area length was chosen to match the approximate width of a narrower zebra crossing, and the area width was chosen to approximate the width of a sidewalk, as shown in Figure 11 and Figure 12. The observation area is configured within the setup of the GTRACK algorithm and can be easily adapted to different situations.
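To make the geometry of the observation area concrete, the following sketch tests whether a tracked centroid lies inside it. The three dimensions come from the text above; the coordinate frame (radar at the origin, x along the sidewalk and centred on the radar, y pointing away from the radar along the ground) and the function name are assumptions made for this illustration, and the actual area is configured inside GTRACK rather than checked this way.

```python
AREA_LENGTH_M = 1.5  # extent along the sidewalk (from the text)
AREA_WIDTH_M = 1.5   # extent away from the radar (from the text)
AREA_OFFSET_M = 2.0  # shift of the near edge away from the radar (from the text)


def in_observation_area(x: float, y: float) -> bool:
    """Return True if a centroid at (x, y) lies inside the observation area.

    Assumed frame: radar at the origin, x along the sidewalk and centred on
    the radar, y pointing away from the radar along the ground.
    """
    inside_x = -AREA_LENGTH_M / 2 <= x <= AREA_LENGTH_M / 2
    inside_y = AREA_OFFSET_M <= y <= AREA_OFFSET_M + AREA_WIDTH_M
    return inside_x and inside_y


waiting = in_observation_area(0.0, 2.5)    # standing in the middle of the area
too_close = in_observation_area(0.0, 1.0)  # between the radar and the area
beside = in_observation_area(1.0, 2.5)     # on the sidewalk, outside the area
```

Adapting the area to a different crossing would then only require changing the three constants.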
To simulate walking on a sidewalk, participants in our experiment always entered by either of the two short edges, depending on the walking direction, and were moving in a tangential direction from the point of view of the radar. Participants who were passing by also exited the observation area by either of the short edges. In contrast, participants who stopped to cross the street exited the observation area by the longer edge as if it faced the zebra crossing.
5.2. Triggering Algorithm
For the experiment, we designed a simple algorithm that manages GTRACK and triggers changes in traffic signalization based on the GTRACK output data. On startup, our algorithm initiates the GTRACK algorithm and creates a table tb_targets. Table tb_targets keeps information on all currently tracked pedestrians along with the timestamps of the pedestrians’ first detection. Upon detection, GTRACK assigns every pedestrian a unique identifier, which is used for as long as GTRACK maintains a lock on the pedestrian with that identifier. Pedestrians are stored in the table under their unique identifiers, which also serve as the table indexing key. In the main program loop, at the beginning of each step, the current time is recorded and stored in the variable timestamp. Following that, GTRACK returns a list of detected pedestrians, which are then stored in the list detected_targets.
Based on the presence and tracking of pedestrians in the observation area, our algorithm either responds to the elapsed waiting time by triggering a traffic signalization change and giving pedestrians the right of way, or it keeps the pedestrian crossing closed and maintains an uninterrupted flow of road traffic if no individual pedestrian remains in the observation area for the given tracking time of 10 s. A more detailed description of the triggering algorithm is shown in the flow diagram in Figure 13 and explained in Algorithm 1.
Algorithm 1 Pedestrian traffic light triggering algorithm
procedure Traffic light control
    initiate GTRACK
    create tb_targets
    repeat ▹ Forever
        set timestamp to current time
        get detected_targets from GTRACK
        for all targets in detected_targets do: ▹ A
            if target not in tb_targets then:
                save target and timestamp to tb_targets
        for all targets in tb_targets do: ▹ B
            if target not in detected_targets then:
                remove target from tb_targets
            else:
                if target is in observable area longer than threshold time of 10 s then:
                    change traffic signalization
    until shutdown
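The triggering logic of this section can be written as a small Python sketch. It keeps the names tb_targets, timestamp and detected_targets from the text, while the `DwellTrigger` class and its `step` method are hypothetical and stand in for the main loop; the GTRACK call itself is replaced by a plain list of target identifiers, assumed to contain only pedestrians tracked inside the observation area.

```python
WAIT_THRESHOLD_S = 10.0  # waiting time before triggering, as stated in the text


class DwellTrigger:
    """Keeps first-detection times per target id and reports call requests."""

    def __init__(self, threshold_s: float = WAIT_THRESHOLD_S):
        self.threshold_s = threshold_s
        self.tb_targets = {}  # target id -> timestamp of first detection

    def step(self, timestamp: float, detected_targets) -> bool:
        """Process one frame; return True if a signal change should be requested."""
        detected = set(detected_targets)
        # Loop A: register newly detected targets with the current timestamp.
        for target in detected:
            self.tb_targets.setdefault(target, timestamp)
        # Loop B: drop lost targets and check the waiting time of the others.
        trigger = False
        for target in list(self.tb_targets):
            if target not in detected:
                del self.tb_targets[target]
            elif timestamp - self.tb_targets[target] > self.threshold_s:
                trigger = True
        return trigger


# Target 1 waits at the kerb; target 2 walks through and is gone after 2 s.
controller = DwellTrigger()
frames = [(0.0, [1, 2]), (2.0, [1]), (9.0, [1]), (10.5, [1])]
results = [controller.step(t, ids) for t, ids in frames]
```

Because a target that disappears from detected_targets is removed from tb_targets, a pedestrian who is lost and detected again starts with a fresh timestamp, which corresponds to the delayed responses reported in the results.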
5.3. Results
Table 6 shows our experimental results. The second column shows the correct responses among all experimental repetitions for the given scenario. The third column shows the number of delayed responses of the system. The delayed responses are undesired, but still correct. Those are the cases where GTRACK temporarily lost lock on participants, which led to the reset of the tracking timer and, consequently, a longer waiting time. This column is only applicable to scenarios 1, 3 and 5. The fourth column shows the number of incorrect responses for the given testing scenario. During the experiment, we observed that the setup performed better when pedestrians had a higher radial velocity.
In almost all cases, pedestrians were still detected even if they had reduced radial velocity, for example, if they walked past the pedestrian crossing. It merely took more detection frames before GTRACK allocated clusters of observed pedestrians. This could be addressed in two ways: we could either elongate the walking strip, which represented the observable pavement area, or set the radar to face in the walking direction of pedestrians. The latter would also improve the detection of pedestrians obstructed by other pedestrians walking by their sides. However, this setup would, at the same time, obscure pedestrians who are walking in single file. Nevertheless, pedestrians obstructed in this direction would possibly still be more easily detected since they would have better diversity in radial velocity.
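The effect of walking direction on radial velocity can be illustrated with a short calculation. The sketch below projects a pedestrian’s velocity onto the line of sight of a radar placed at the origin of a 2D ground plane; the radar height is ignored, and the walking speed of 1.4 m/s and the positions are assumed example values, not measurements from the experiment.

```python
import math


def radial_velocity(x, y, vx, vy):
    """Velocity component along the line of sight from a radar at the origin.

    Positive values mean that the target moves away from the radar.
    """
    rng = math.hypot(x, y)
    return (x * vx + y * vy) / rng


WALK_SPEED = 1.4  # m/s, assumed typical walking speed (not from the paper)

# Tangential walk along x, 2.75 m in front of the radar.
at_edge = radial_velocity(-0.75, 2.75, WALK_SPEED, 0.0)
at_centre = radial_velocity(0.0, 2.75, WALK_SPEED, 0.0)
# Walking straight towards the radar instead.
head_on = radial_velocity(0.0, 2.75, 0.0, -WALK_SPEED)
```

For the tangential walk, the magnitude of the radial velocity stays well below the walking speed and drops to zero directly in front of the radar, whereas a pedestrian approaching head-on presents the full walking speed to the sensor.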
In a few cases in scenarios one, three and five, GTRACK lost the lock on pedestrians because they stood too still. However, when they moved slightly, GTRACK detected them again, which resulted in a delayed response of the traffic light triggering algorithm because the tracking timer restarted. In two cases in scenario three, the algorithm did not obtain a lock on a waiting pedestrian, as their radial velocity was not high enough for the GTRACK algorithm to detect them successfully. Furthermore, in one other case in the same scenario, the algorithm lost track of the standing pedestrian and did not recognize them again. In some cases in scenarios two and four, GTRACK did not lose lock after pedestrians left the observation area. This happened when pedestrians moved too close to moving clutter as they exited the area, so the lock on the exiting pedestrian was sometimes transferred to the moving clutter. A similar error happened when a tracked pedestrian was exiting the observation area at the point where another pedestrian was entering, so that the exiting and entering pedestrians passed each other just at the edge of the area. In that case, the track of the exiting pedestrian was transferred to the entering pedestrian, which neither stopped the tracking timer of the exiting pedestrian nor started a new timer for the entering pedestrian.
If we count all correct and delayed responses together (277 + 10 = 287) and all incorrect responses (13) over a total of 300 repetitions, we obtain a system performance of 287/300 or 95.67% and an error of 13/300 or 4.33%. However, if we count only the correct responses (277) and group the delayed responses with the incorrect ones (10 + 13 = 23) over a total of 300, we obtain a system performance of 277/300 or 92.33% and an error of 23/300 or 7.67%. Furthermore, if we exclude the delayed responses and count only correct and incorrect responses with a total of 290, we obtain a system performance of 277/290 or 95.52% and an error of 13/290 or 4.48%.
A separate evaluation of scenarios one, three and five shows that the system correctly recognized a waiting pedestrian in 137 cases, combined with 10 cases of the system’s delayed response. With 3 incorrect responses over a total of 150 cases, we obtain a system performance of 147/150 or 98% and an error of 3/150 or 2%.
If we similarly evaluate scenarios two, four and six, we can observe that the system correctly disregarded pedestrians who were only passing in 140 cases and mis-triggered in 10 cases in a total of 150 cases. This gives a performance of 140/150 or 93.33% and an error of 10/150 or 6.67%.
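The overall percentages can be checked with a few lines of Python. The counts of 277 correct and 10 delayed responses out of 300 repetitions are those reported in the conclusions; the number of incorrect responses is taken as the remainder, and the helper name `pct` is only illustrative.

```python
TOTAL = 300     # 6 scenarios x 50 repetitions
CORRECT = 277   # correct responses, as reported in the conclusions
DELAYED = 10    # delayed but still correct responses
INCORRECT = TOTAL - CORRECT - DELAYED  # remaining repetitions


def pct(part: int, whole: int) -> float:
    """Percentage of part in whole, rounded to two decimals."""
    return round(100.0 * part / whole, 2)


# Delayed responses counted as correct.
perf_delayed_ok = pct(CORRECT + DELAYED, TOTAL)
err_delayed_ok = pct(INCORRECT, TOTAL)

# Delayed responses counted as incorrect.
perf_delayed_bad = pct(CORRECT, TOTAL)
err_delayed_bad = pct(INCORRECT + DELAYED, TOTAL)

# Delayed responses excluded from the total.
perf_excluded = pct(CORRECT, TOTAL - DELAYED)
err_excluded = pct(INCORRECT, TOTAL - DELAYED)
```

The three pairs reproduce the reported 95.67% and 4.33%, 92.33% and 7.67%, and 95.52% and 4.48%, respectively.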
Figure 14 shows an example of two pedestrians walking towards each other. Box frames represent the approximate calculated position of each pedestrian, green dots on the floor show previous locations of tracked pedestrians, and blue points are points of reflections detected by the radar. From these points, it is also impossible to recognize any identifiable features of pedestrians. An example of three separate pedestrians’ tracks is shown in Figure 15.
6. Discussion
Our results show that we were able to detect and track pedestrians, along with their intent, by using a fairly simple algorithm. Using this as a basis, more complex functionalities could be implemented even with the current setup, for example, a pedestrian call extension for pedestrians entering the radar’s observation area while the street is open for crossing. However, it is still better to extend the observation area over the whole crosswalk for more reliable operation [20]. Our detection of waiting pedestrians was also based on a fixed continuous observation time, which could be further studied as in [27]. These and other more complex functionalities can be implemented by using simple logic algorithms or perhaps by training a neural network instead. Research into arrays of multistatic radar sensors connected in a network [58] provides even more coverage and is opening new possibilities in advanced pedestrian tracking. This approach additionally benefits from migrating GTRACK to a separate processor, as it would be easier to modify a single GTRACK instance to detect and track targets of multiple radars within the same multistatic configuration.
Besides logistical benefits, this method also has the potential to decrease traffic accidents involving pedestrians and, since it also minimizes unnecessary vehicle stops [9], it can help to reduce the carbon footprint. An additional benefit of the proposed system is that it removes the need to touch the call button, which is especially important in times of epidemics, when touching a public surface might increase the possibility of infection. Using contactless detectors like the one proposed in this paper or those described in [9,20,48] can contribute to slowing the spread of infectious diseases.
Since the proposed system is operating within an unlicensed radio frequency (RF) spectrum between 57 GHz and 71 GHz and with an average
dBm, it does not require any permission from a regulator such as the Federal Communications Commission (FCC) [59]. Its operating power may need to be reduced to the average
dBm to comply with the regulations. However, power requirements are set slightly differently, depending on the regional regulator of the RF spectrum. Additionally, since FMCW radars operate on different sweep frequencies, we do not expect to cause or suffer any interference from other FMCW radars, which is additionally beneficial for testing in a real-life scenario. To evaluate the system in a real-life scenario, we would still need to acquire approvals from the local authorities where testing would be conducted and from the operator of the experimental testing intersection.
7. Conclusions
In the test scenarios, where we evaluated the performance of the proposed system for activating the green pedestrian signal, the system responded correctly in 277 cases out of a total of 300 repetitions across all six experimental scenarios. In 10 further cases, the system’s response was delayed, but it still responded correctly, for a total performance of 95.67% and an error of 4.33%. The delays occurred because the system lost lock on a waiting pedestrian for a short time, which led to longer waiting times for those pedestrians. The system struggled most in cases where pedestrians arrived in strongly tangential directions with low radial velocities, which led to longer detection times. In contrast to video-surveillance systems that use either a standard video camera or an infrared camera, this performance is constant across lighting conditions. We want to point out that all of the experiments were performed in dry weather. Therefore, the performance of the proposed system would have to be similarly evaluated in future studies, with experiments performed in foggy and rainy weather.
Assessing different setups of radar position and observation areas is left for future research, the most interesting of which is using two radars to observe the same area. The system’s accuracy in positioning detected targets is also yet to be evaluated. To do this, we need a reference system with known higher accuracy, preferably one not based on radar technology because, as we have observed, these radars struggle with targets moving in a tangential direction. An interdisciplinary study in the field of psychology may also be considered to find an optimal waiting time before the system triggers the change in traffic signalization.
To take full advantage of this design, we could extend the radar’s observation area over the whole crosswalk and continue tracking while pedestrians have a green signal. This observation area extension makes it possible to further optimize traffic flow by changing to a red signal immediately after there are no more pedestrians crossing the street [9]. Furthermore, because observed pedestrians were moving in a tangential direction with respect to the radar, extending the observation area would allow the radar to face incoming pedestrians at a more favorable angle. We want to note that the radar can also be rotated in the azimuthal direction, which could, depending on the setup, also improve the radar’s detecting and tracking capabilities.