*Article* **Robust Traffic Light and Arrow Detection Using Digital Map with Spatial Prior Information for Automated Driving**

#### **Keisuke Yoneda 1,\*,†, Akisuke Kuramoto 2,†, Naoki Suganuma 1, Toru Asaka 1, Mohammad Aldibaja 1 and Ryo Yanase 1**


Received: 25 December 2019; Accepted: 17 February 2020; Published: 21 February 2020

**Abstract:** Traffic light recognition is an indispensable elemental technology for automated driving in urban areas. In this study, we propose an algorithm that recognizes traffic lights and arrow lights by image processing, using a digital map and the precise vehicle pose estimated by a localization module. The use of a digital map allows a region-of-interest in the image to be determined, reducing both computational cost and false detections. In addition, this study develops an algorithm that recognizes arrow lights using the relative positions of the lights within a traffic light as prior spatial information. This allows the recognition of distant arrow lights that are difficult even for humans to see clearly. Experiments were conducted to evaluate the recognition performance of the proposed method and to verify whether it meets the performance required for automated driving. Quantitative evaluations indicate that the proposed method achieved average f-values of 91.8% and 56.7% for traffic lights and arrow lights, respectively. It was confirmed that the arrow-light detection could recognize small arrow objects even when their size was smaller than 10 pixels. The verification experiments indicate that the performance of the proposed method meets the requirements for smooth acceleration and deceleration at intersections in automated driving.

**Keywords:** image processing; traffic light detection; intelligent transportation system

#### **1. Introduction**

Automated vehicle technologies are considered to be the next-generation transportation system. Many companies and research organizations are involved in the research and development of such technologies. Recent automated vehicle technologies focus more on urban driving, which is a mixed transportation environment with human drivers. Public road demonstration experiments have been carried out in the U.S. and in European and Asian countries since the mid-2000s [1–3]. Automated driving in environments mixed with human-driven vehicles, pedestrians, and cyclists requires recognizing surrounding objects autonomously and making decisions according to traffic rules. The following sensors are mainly mounted on automated vehicles for surrounding recognition.


The above mentioned sensors make it possible to observe surrounding transportation environments. In addition, recent automated driving technologies rely on high-definition maps (HD maps) which include the precise position of static road features such as lane boundaries, lane centerlines, traffic signs, and traffic lights. By referring to the predefined road features, it is possible to reduce false surrounding recognitions and implement accurate decision making, considering road structures. The following functions must be implemented in order to achieve automated driving on public roads.


This study focuses on traffic light (TL) detection using HD maps. There are many studies on TL detection using vision sensors in intelligent transportation systems. The TL is one of the most important road features for deciding how to approach an intersection. Although HD maps contain TL positions, the vehicle must recognize the current state of the TLs in real time because it changes dynamically. For safe deceleration, it is necessary to recognize the current state of TLs at distances over 100 m. The required recognition distance can be estimated by calculating the braking distance from the vehicle to the stop line after the TL state has been recognized. In studies on vehicle control [4,5], the deceleration that causes no discomfort to passengers is approximately 0.1 G (≈0.98 m/s<sup>2</sup>). For example, when the vehicle decelerates at 0.1 G while traveling at a velocity of 50 km/h, the braking distance is approximately 98 m. Furthermore, the recognition distance increases further when the TL is located away from the stop line. Recognizing TLs at ranges exceeding 100 m is therefore required to make a natural intersection approach in automated driving. In order to implement a practical method of TL recognition, it is necessary to discuss the effectiveness of the methods considering the trade-off between the required performance and the hardware specification. For example, installing a high-resolution camera or a telephoto lens is an easy way to increase the recognition distance. However, increasing the resolution may increase the processing time, and a narrowed field of view means that TLs may fall outside the image. From the point of view of implementing a recognition method, the key question is how to recognize objects that occupy only a few pixels.
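The braking-distance estimate above follows directly from d = v²/(2a). A minimal sketch of the calculation (the 0.1 G comfort limit is from the text; the function name and g = 9.8 m/s² are illustrative):

```python
def braking_distance_m(speed_kmh: float, decel_g: float = 0.1) -> float:
    """Distance needed to stop from speed_kmh at a constant deceleration
    of decel_g (in units of g, with g = 9.8 m/s^2): d = v^2 / (2a)."""
    v = speed_kmh / 3.6          # km/h -> m/s
    a = decel_g * 9.8            # g units -> m/s^2
    return v * v / (2.0 * a)

# 50 km/h at a comfortable 0.1 G stops in roughly 98 m, matching the text.
```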

On the other hand, in automated driving using HD maps, the self-localization module precisely estimates the vehicle pose by map-matching using a range sensor or image sensor [6–8]. Generally, a position accuracy of approximately 0.1 to 0.2 m is considered necessary for decision making and path planning in automated driving. Assuming that the precise vehicle position on the digital map is estimated, a region-of-interest (ROI) location for the TLs can be calculated from the registered TL position and the current vehicle pose. Extracting the ROI makes it possible to reduce the search region for TLs, and thereby to reduce computational costs and false detections, both false-positive and false-negative [9–12]. In addition to improving recognition performance, associating TLs registered on the map with TLs in the image is an important aspect of map-based recognition. In decision making using HD maps, it is necessary to grasp the state of the relevant TLs in order to make an approach decision at the intersection.

The purpose of this study is to achieve small-object recognition for both TLs and arrow TLs. Different types of candidate detectors are introduced to realize robust lighting-area detection. In addition, in order to implement robust arrow recognition, the proposed method uses the spatial positional relationship of the lights within a TL, obtained from the HD map, as prior information. The proposed method is specialized for the recognition of small-sized objects, and its recognition performance is evaluated on actual driving data. Moreover, verification experiments for automated driving were carried out to investigate the validity of the method.

The rest of this paper is organized as follows. Section 2 describes related work. Section 3 introduces the proposed TL detection method. Section 4 presents evaluations of the proposed method with some discussion. Finally, Section 5 concludes the paper with some remarks.

#### **2. Related Works**

Table 1 summarizes the reported results of major research on TL recognition. Based on state-of-the-art research, the recognition procedure can be described as follows:



**Table 1.** Summary of related studies on traffic light recognition.

Although different approaches have been developed, most of the methods involve extracting candidates according to their specific color spaces [14,20] and circular shape [10,21], and identifying them as TLs or arrow TLs. Regardless of the country, TLs mainly consist of circular and arrow-shaped lights. In the case of ROI-based recognition, detection of circular objects is an effective approach to recognizing lighting areas because the search region is limited. According to Table 1, many methods adopt a blob detector that extracts candidate objects by binarizing the image and segmenting pixels [11,13,14]. It can detect circular objects even if their size is only a few pixels. The recognition of the whole shape of the TL is then implemented using specific shape matching or machine learning. Moreover, the effect of introducing object tracking to stabilize the recognition result has been reported [14,22]. In recent years, there have been reports of performance improvements using deep neural networks (DNNs) [11,14,16,18,23]. For detecting arrow TLs, a machine learning-based detector is a key solution. As shown in Table 1, it has been reported that these methods can recognize TLs at distances exceeding 100 m, with recognition rates of approximately 90%. However, it is difficult to compare the performances directly, because the camera specifications (image sensor, resolution, and field of view), driving scenes, and data quality differ. Herein, we discuss algorithm limitations by comparing the pixel size of recognizable objects.

On the other hand, assuming that our algorithm will be introduced into automated vehicles, real-time processing is important for decision making. In addition, in order to reduce the recognition delay, it is necessary to recognize TLs within a processing time appropriate to the velocity of the vehicle. For example, when traveling at a velocity of 50 km/h, a vehicle moves about 14 m per second. It is therefore important to estimate the required processing time in consideration of responsive deceleration for practical development.

In our previous study [24], a TL recognition method was proposed using a speeded-up robust features (SURF) [25]-based circular object detector. It can detect circular objects, like a blob detector, without binarization; therefore, robust candidate extraction is expected. In addition, the method estimates the existence probability of lighting objects in the image. It has the advantage of reducing false-positive detections caused by surrounding lighting objects. In this work, we improve our method by integrating the typical candidate detectors: a circular detector using SURF, a blob detector, and a TL shape detector. In particular, we investigate the performance and limitations of the arrow detection method when prior information is introduced in order to robustly detect arrow TLs. Moreover, we verify that the performance requirements for the recognition distance are satisfied by performing automated driving on an urban road. The following are the contributions of this paper.


#### **3. Proposed Traffic Light Detection**

#### *3.1. Traffic Light Detection*

Before describing the proposed algorithm, the TL recognition problem addressed in this study is explained. The task is to recognize the state of the TLs in the image that correspond to the HD map. As shown in Figure 1a,b, it is necessary to properly recognize the lighting status of TLs both in the day and at night. On the HD map, the position of each TL is recorded individually, and the TL positions in the camera image can then be calculated from the positions of the TL and the vehicle. Figure 1c shows a typical ROI image extracted by coordinate transformation using the HD map for a driving image. As in the enlarged image in Figure 1c, the extracted ROI image may include TLs other than the target one, as well as background lighting objects. In implementing automated driving at intersections, the purpose is to recognize the TL associated with the ROI. Therefore, if a different TL is recognized in the specific ROI, it counts as a false-positive detection.

Figure 2 shows the TL patterns to be recognized by the proposed method. This study focuses on recognizing TLs in the Japanese traffic environment. We deal with TL patterns that include the three basic lights (green, yellow, red) and three arrow lights (left, straight, and right) that exist depending on the road environment in Japan. In addition, because there are horizontal and vertical TLs depending on the area in Japan, the proposed method recognizes these patterns as well. As a special case, arrow lights in positions or directions different from those shown in Figure 2 are installed in the actual environment. Recognition performance for such special situations has not been evaluated in this work, but the method can easily be extended by using the digital map information described herein as prior information. Although the proposed method is evaluated on Japanese traffic images in this work, it can be applied to general traffic lights that consist of circular lights and arrow lights. In the proposed method, the recognition distance can be improved if the arrangement pattern of the signal lights and the arrow lights is known for the target TLs, as shown in Figure 2.

(c) Region-Of-Interest for recognition

**Figure 1.** Typical traffic light image at different brightness and region of interest (ROI) image.

**Figure 2.** Traffic light patterns that can be recognized in the proposed method.

#### *3.2. Predefined Digital Map*

A highly precise digital map is maintained by a growing number of professional mapping companies. Accurate positioning systems combined with cameras and LiDAR sensors make it possible to generate precise 3-D maps that contain latitude, longitude, altitude, and reflectivity. For the purpose of TL detection, the location of each TL was recorded in the map. The method described in this paper uses the following information as prior map information:


Although the exact altitude of the TLs is not used as prior information, they are installed at a height of approximately 5.0 m above the ground surface in the Japanese road environment. Therefore, the recognition process uses 5.0 m as a reference height and provides a margin that allows for road gradients. Although the standard height of TLs is specified in Japan, it may differ in other countries. It is necessary to set an appropriate height according to the target country, or to set a wider recognition area in the image when the height information is unknown.

#### *3.3. Method*

Figure 3 illustrates a flowchart of the proposed method. It mainly consists of the following five procedures:


**Figure 3.** Flowchart of the proposed method.

As described in Section 2, most of the existing recognition methods mainly use individual features such as circular objects, blob regions, and overall shapes via a machine learning (ML) detector to recognize TL candidates. The proposed method combines them to detect candidates robustly. By using detection methods that focus on the lighting area of the TL, it is possible to recognize TLs even when the overall shape is not visible, such as during occlusion or at night. Figure 4 shows typical driving images in occluded and dark scenes. Occlusion of TLs is caused by surrounding vehicles such as a preceding vehicle, a bus, or a truck. As shown in Figure 4a, there are occluded situations where it is difficult to see the whole shape of the TL; the TL state must then be recognized from the lighting area alone. The situation where the overall shape cannot be visually recognized also occurs at night, as shown in Figure 4b. Section 4 evaluates the contribution of each method by comparing the recognition performances.

**Figure 4.** Typical traffic light (TL) images in occluded and dark scenes.

Arrow detection requires recognition of directions, which is affected by unclear images. In order to improve such distant arrow recognition, we propose an arrow recognition method using prior information of the HD map. In addition, this study verifies the effects of using prior information in the arrow light recognition. The algorithms are described in detail in the following section.

#### *3.4. Coordinate System and Traffic Light Selection*

Figure 5 illustrates the coordinate systems considered in this work. The latitude and longitude values are converted to the 2-D *xg* − *yg* space in the Universal Transverse Mercator (UTM) coordinate system. The world coordinate system is defined by *xg*, *yg* and *zg* (=altitude). The vehicle coordinate system is centered at the rear wheels: the front direction is *xv*, the left direction is *yv*, and the upper direction is *zv*. In a similar way, sensor coordinates (*xs* − *ys* − *zs*) and image coordinates (*u* − *v*) are defined as shown in Figure 5a. The world, vehicle, and sensor coordinates are related by rotation and translation matrices. Sensor and image coordinates are related by the intrinsic parameters.

Among the TLs that appear in the frontal camera image, those within a certain distance *dT*, and whose heading angle difference is within a certain angle *θT*, are extracted as target TLs to be recognized. The target TLs are extracted based on the distance parameter *dT* from the faced traffic signals, as shown in Figure 5b, where the red TLs are the extracted targets.
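The target-TL selection can be sketched as a simple distance-and-heading filter. The per-TL tuple layout and the default thresholds below are hypothetical, not the authors' values:

```python
import math

def select_target_tls(vehicle_xy, vehicle_heading_deg, tls,
                      d_T=150.0, theta_T=45.0):
    """Pick TLs within distance d_T [m] whose facing direction differs
    from the oncoming direction by at most theta_T [deg].  Each TL is a
    (x, y, facing_deg) tuple; facing_deg is the direction the signal
    faces (a hypothetical map schema for illustration)."""
    targets = []
    for x, y, facing in tls:
        d = math.hypot(x - vehicle_xy[0], y - vehicle_xy[1])
        # A faced signal points roughly opposite to the vehicle heading.
        diff = abs((facing - (vehicle_heading_deg + 180.0) + 180.0) % 360.0 - 180.0)
        if d <= d_T and diff <= theta_T:
            targets.append((x, y, facing))
    return targets
```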

**Figure 5.** Coordinate systems and traffic light selection.

#### *3.5. ROI Clipping*

The ROI image is clipped for each target TL. The location of the ROI can be calculated based on the current pose and the map database. A global TL position *xw* = [*xw*, *yw*, *zw*, 1] is converted to *xv* = [*xv*, *yv*, *zv*, 1] and *xs* = [*xs*, *ys*, *zs*, 1] by the vehicle pose and extrinsic parameters of the camera.

$$\mathbf{x}_v = R_{wv} \mathbf{x}_w \tag{1}$$

$$\mathbf{x}_s = R_{vs} \mathbf{x}_v, \tag{2}$$

where *Rwv* and *Rvs* are 4 × 4 homogeneous transformation matrices for converting world-to-vehicle coordinates and vehicle-to-sensor coordinates, respectively. As described above, if there is no information on the absolute height of the TL, the general TL height is used to compute *zv*, assuming a flat surface. In this case, the height *zv* is calculated by the following equation:

$$z_v = z_0 - x_v \tan \phi, \tag{3}$$

where *z*<sub>0</sub> is the general TL height from the road surface (e.g., *z*<sub>0</sub> = 5.0 m), and *φ* is the pitch angle of the vehicle. The pixel position (*u*, *v*) of the signal is then calculated based on the intrinsic parameters and the following set of equations:

$$x' = x_s / z_s, \quad y' = y_s / z_s, \quad r^2 = x'^2 + y'^2 \tag{4}$$

$$x'' = x'(1 + k_1 r^2 + k_2 r^4) + 2 p_1 x' y' + p_2 (r^2 + 2 x'^2) \tag{5}$$

$$y'' = y'(1 + k_1 r^2 + k_2 r^4) + p_1 (r^2 + 2 y'^2) + 2 p_2 x' y' \tag{6}$$

$$u = f_x x'' + c_x, \quad v = f_y y'' + c_y, \tag{7}$$

where *fx*, *fy*, *cx*, *cy*, *k*<sub>1</sub>, *k*<sub>2</sub>, *p*<sub>1</sub>, *p*<sub>2</sub> are intrinsic parameters of the camera. The ROI is defined as a rectangle with a width *wroi* and a height *hroi*, centered at the pixel (*u*, *v*).

$$w_{roi} = k_{roi} \frac{f_x s_s}{z_s} \tag{8}$$

$$h_{roi} = k_{roi} \frac{f_y s_s}{z_s}, \tag{9}$$

where *ss* is the size of a TL and *kroi* is a constant parameter to determine a scale of the ROI.
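Equations (1)–(9) amount to a rigid-body transform, a normalized pinhole projection with radial and tangential distortion, and a depth-scaled ROI size. A minimal sketch (the default `s_s` and `k_roi` values are illustrative placeholders, not the authors' calibration):

```python
import numpy as np

def project_tl_to_roi(x_w, R_wv, R_vs, fx, fy, cx, cy,
                      k1=0.0, k2=0.0, p1=0.0, p2=0.0,
                      s_s=1.25, k_roi=3.0):
    """Project a homogeneous world TL position x_w to a pixel (Eqs. 1-7)
    and derive the ROI size (Eqs. 8-9)."""
    x_v = R_wv @ x_w                              # world -> vehicle (Eq. 1)
    x_s = R_vs @ x_v                              # vehicle -> sensor (Eq. 2)
    xp, yp = x_s[0] / x_s[2], x_s[1] / x_s[2]     # normalized coords (Eq. 4)
    r2 = xp * xp + yp * yp
    radial = 1.0 + k1 * r2 + k2 * r2 * r2
    xpp = xp * radial + 2 * p1 * xp * yp + p2 * (r2 + 2 * xp * xp)   # Eq. 5
    ypp = yp * radial + p1 * (r2 + 2 * yp * yp) + 2 * p2 * xp * yp   # Eq. 6
    u, v = fx * xpp + cx, fy * ypp + cy           # pixel position (Eq. 7)
    w_roi = k_roi * fx * s_s / x_s[2]             # ROI width  (Eq. 8)
    h_roi = k_roi * fy * s_s / x_s[2]             # ROI height (Eq. 9)
    return (u, v), (w_roi, h_roi)
```

With identity transforms and no distortion, a TL 50 m ahead on the optical axis lands at the principal point, and the ROI shrinks inversely with depth.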

#### *3.6. Highlighted Image Generation*

The lighting areas of TLs have higher brightness and saturation than other objects. In order to extract lighting areas, RGB images are converted into the HSV color space, and the highlighted image is generated by multiplying the saturation image by the brightness image. The lighting area of the TL is thereby emphasized, as shown in Figure 6a. However, in some cases the highlighted image cannot emphasize the lighting area sufficiently, especially for lamp-type TLs. In addition, in the recognition of distant TLs where the image is unclear, false-positive detections may occur under the influence of background noise. In order to solve these problems, we previously reported a method that reduces false detections, both false-positive and false-negative, at long distances by correcting and weighting the highlighted images [26]. The following processes are applied to emphasize TLs:


**Figure 6.** Highlighted image generation.

The first operation updates the image brightness. The brightness value is normalized using the following equation:

$$V_m(u, v) = k_v \bar{V} + \frac{\sigma_v}{\sigma} \left( V(u, v) - \bar{V} \right), \tag{10}$$

where *V*(*u*, *v*) and *Vm*(*u*, *v*) are the original brightness value from the HSV image and the modified brightness value at pixel (*u*, *v*), respectively. *V̄* is the average brightness of the original image *V*, *σ* is its standard deviation, and *σv* is the target standard deviation for the updated image *Vm*. *kv* is a constant parameter that increases the average brightness.
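Equation (10) is a mean-shift-and-rescale of the brightness channel. A minimal sketch (the values of `k_v` and `sigma_v` are illustrative; the paper does not state them here):

```python
import numpy as np

def normalize_brightness(V, k_v=1.2, sigma_v=64.0):
    """Eq. (10): move the mean to k_v * mean(V) and rescale the spread of
    the brightness channel to sigma_v, clipping to the 8-bit range."""
    V = V.astype(np.float64)
    mean, std = V.mean(), V.std()
    Vm = k_v * mean + (sigma_v / std) * (V - mean)
    return np.clip(Vm, 0.0, 255.0)
```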

The second operation updates the saturation values. The lighting area of the traffic signal generally has saturation values above a certain value. The pixels with saturations lower than this value are reduced using the following equation:

$$S_m(u, v) = S(u, v) \cdot \frac{1}{1 + \exp\left(-a_s (S(u, v) - b_s)\right)}, \tag{11}$$

where *S*(*u*, *v*) and *Sm*(*u*, *v*) are the original saturation values from the HSV and modified saturation value, respectively. *as* and *bs* are constant parameters of the sigmoid function to reduce the saturation value. *Sm* and *Vm* are used to generate the highlighted image instead of the SV-image.
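Equation (11) is a soft threshold: a sigmoid factor that attenuates pixels whose saturation falls below roughly *bs* while leaving strongly saturated pixels almost unchanged. A minimal sketch with illustrative `a_s` and `b_s`:

```python
import numpy as np

def suppress_low_saturation(S, a_s=0.1, b_s=80.0):
    """Eq. (11): multiply each saturation value by a sigmoid gate so that
    values well below b_s are driven toward zero and values well above
    b_s pass through nearly unchanged."""
    S = S.astype(np.float64)
    return S * (1.0 / (1.0 + np.exp(-a_s * (S - b_s))))
```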

The third operation multiplies the pixel values of the highlight image with weight, with respect to the hue values. Figure 6b shows the definition of weighting value. It means that the hue value closest to the lighting color of the traffic signal has a higher weight value.

$$W_G(u, v) = \exp \frac{-(H(u, v) - \mu_G)^2}{2 \sigma_G^2} \tag{12}$$

$$W_Y(u, v) = \exp \frac{-(H(u, v) - \mu_Y)^2}{2 \sigma_Y^2} \tag{13}$$

$$W_R(u, v) = \exp \frac{-(H(u, v) - \mu_R)^2}{2 \sigma_R^2} \tag{14}$$

$$H_w(u, v) = \max\left( W_G(u, v), W_Y(u, v), W_R(u, v) \right), \tag{15}$$

where *H*(*u*, *v*) is the original hue value from the HSV image, *W*∗(*u*, *v*) is the obtained weight value, and *μ*∗ and *σ*∗ are the mean and standard deviation of the weighting for the corresponding color. *μ*∗ and *σ*∗ should be determined according to the color balance of the camera.
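Equations (12)–(15) weight each pixel by its Gaussian proximity in hue to the three signal colors and keep the maximum. A minimal sketch; the (μ, σ) pairs below are placeholders that would need tuning to the camera's color balance, as the text notes:

```python
import numpy as np

def hue_weight(H, mu, sigma):
    """Gaussian weight around one signal color (Eqs. 12-14)."""
    return np.exp(-(H - mu) ** 2 / (2.0 * sigma ** 2))

def combined_hue_weight(H, params=((60.0, 15.0), (30.0, 10.0), (0.0, 10.0))):
    """Eq. (15): take the pixel-wise max over green/yellow/red weights.
    params holds illustrative (mu, sigma) pairs for the three colors."""
    H = H.astype(np.float64)
    return np.max([hue_weight(H, m, s) for m, s in params], axis=0)
```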

#### *3.7. Candidate Extraction and State Classification*

After the generation of the highlighted image, lighting-area detection is applied to the obtained image. As shown in Figure 3, the proposed method introduces three types of methods to extract candidate lighting objects.

The first method is the circle detector. The shape of the lighting area in the image is generally circular. A method based on the Hough transform has been adopted to extract circular candidates [21]. However, because a clear circular area cannot be obtained in the image for a distant TL, the blob detector described later was adopted in many works. Each SURF keypoint has a Hessian matrix *H*, and the type of edge can be categorized using det(*H*), as shown in Figure 7a. Candidate circle areas can be extracted as keypoints with det(*H*) higher than a threshold value *Hmin*. This approach can extract circular objects robustly, because SURF keypoints are robust to illumination and scale changes.
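The idea of thresholding det(*H*) can be illustrated without the full SURF machinery by computing a single-scale determinant-of-Hessian response with finite differences: bright circular blobs make both second derivatives strongly negative, giving a large positive determinant. This is a simplified stand-in, not the authors' SURF-based detector:

```python
import numpy as np

def hessian_response(img):
    """Single-scale determinant-of-Hessian response via finite
    differences: det(H) = Lxx * Lyy - Lxy^2 at each interior pixel."""
    f = img.astype(np.float64)
    Lxx = np.zeros_like(f); Lyy = np.zeros_like(f); Lxy = np.zeros_like(f)
    Lxx[:, 1:-1] = f[:, 2:] - 2 * f[:, 1:-1] + f[:, :-2]
    Lyy[1:-1, :] = f[2:, :] - 2 * f[1:-1, :] + f[:-2, :]
    Lxy[1:-1, 1:-1] = (f[2:, 2:] - f[2:, :-2] - f[:-2, 2:] + f[:-2, :-2]) / 4.0
    return Lxx * Lyy - Lxy * Lxy

def circle_candidates(img, h_min):
    """Return (x, y) pixels whose det(H) exceeds the threshold H_min."""
    det = hessian_response(img)
    ys, xs = np.where(det > h_min)
    return list(zip(xs.tolist(), ys.tolist()))
```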

**Figure 7.** Candidate extraction.

The second method is the blob detector. In the feature image, areas with higher pixel values are distributed near the lighting areas. These areas can be extracted by binarizing and segmenting the image, as shown in Figure 7b. Such extraction of candidate objects by binarization according to brightness, saturation, and color characteristics has also been adopted in many related works [11,13,14]. Results are expected to be similar to circle detection, but the blob detector is expected to outperform the circle detector when part of the lighting area is saturated and cannot be recognized as a circular shape in the feature image. However, the blob detector is sensitive to the threshold used for binarization. If there is a signboard with a color close to the lighting color in the background, it may be detected as a false positive. In addition, when the brightness of the lighting area in the image changes due to surrounding light, false negatives may occur.
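The blob detector described above can be sketched as binarization followed by connected-component segmentation. The 4-connected BFS labeling below is a minimal illustration; production code would use an optimized component labeler:

```python
import numpy as np
from collections import deque

def blob_detect(feature_img, thresh, min_px=2):
    """Binarize the feature image and segment connected bright regions,
    returning each blob as (centroid_x, centroid_y, pixel_count)."""
    mask = feature_img > thresh
    seen = np.zeros_like(mask, dtype=bool)
    h, w = mask.shape
    blobs = []
    for y in range(h):
        for x in range(w):
            if mask[y, x] and not seen[y, x]:
                q, pts = deque([(y, x)]), []
                seen[y, x] = True
                while q:                      # 4-connected flood fill
                    cy, cx = q.popleft()
                    pts.append((cy, cx))
                    for ny, nx in ((cy-1, cx), (cy+1, cx), (cy, cx-1), (cy, cx+1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny, nx] and not seen[ny, nx]:
                            seen[ny, nx] = True
                            q.append((ny, nx))
                if len(pts) >= min_px:        # drop single-pixel noise
                    ys, xs = zip(*pts)
                    blobs.append((sum(xs) / len(xs), sum(ys) / len(ys), len(pts)))
    return blobs
```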

The third method is the ML detector. Because detection is performed on camera images, the brightness of the whole image may be affected by the surroundings, such as sunlight and environmental light. In such cases, the lighting area of the TL cannot be sufficiently emphasized in the highlighted image. In order to recognize such TLs with lower brightness, it is effective to also focus on the whole shape of the TL. Machine learning is a common approach to detecting such shapes. In the proposed method, the OpenCV cascade detector trained by AdaBoost [27] is used as a CPU-based detector for TL recognition. In recent years, DNN-based detectors have shown high recognition performance in general object recognition [17,28]. DNN models such as SSD [17] and YOLOv3 [28] are typical networks for detecting objects in real time. However, because DNN models require GPU processing, it is necessary to select an appropriate ML detector in consideration of the trade-off between computational resources and recognition performance.

The lighting states of the objects detected by these methods are classified, and the final candidates are determined. In order to eliminate false-positive detections, objects with comparatively small or large radii are deleted based on the radius *rl* = 0.5 *fxsl*/*zs* pre-calculated from the HD map, where *sl* is the diameter of the TL lamp. Candidates are accepted if their radius lies between the minimum radius *kminrl* and the maximum radius *kmaxrl*, based on the parameters *kmin* and *kmax*. In addition, in order to reduce the processing time, the ML detector is used only when the circle and blob detectors have not detected any objects.
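The radius gate can be sketched as follows; the candidate dictionary layout and the `k_min`, `k_max` defaults are hypothetical:

```python
def filter_by_radius(candidates, fx, s_l, z_s, k_min=0.5, k_max=2.0):
    """Reject candidates whose radius is implausible for a lamp of
    diameter s_l [m] at depth z_s [m].  The expected radius from the HD
    map is r_l = 0.5 * fx * s_l / z_s; candidates must fall within
    [k_min * r_l, k_max * r_l]."""
    r_l = 0.5 * fx * s_l / z_s
    return [c for c in candidates if k_min * r_l <= c["radius"] <= k_max * r_l]
```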

The lighting color of the candidate object needs to be classified from the hue and brightness distribution of the lighting area. In the proposed method, histograms of the highlighted image and the hue image are created for the detected lighting area, and an AdaBoost classifier is trained using the histograms, normalized by their maximum frequency, as a feature vector. The generated classifier recognizes the lighting state as one of four classes: Green, Yellow, Red, and Background.

#### *3.8. Probability Updating*

The candidates detected in the ROI are potential objects of the target TL. In order to output a likely object from the obtained candidates, a time-series tracking process is performed by computing an existence probability. In [14], multi-object tracking is implemented to improve recognition accuracy. In the proposed method, the whole shape of the TL is not always recognized. Therefore, instead of general tracking that treats the target as a mass point, the existence probability of the object is estimated in the 2-D image space. The existence probability is computed using a binary Bayes filter (BBF). The following equation shows the relationship between the log-odds *l* and the probability *p* for the *i*-th target signal at time *t*:

$$p_i(u, v \mid z_{1:t}, x_{1:t}) = \frac{1}{1 + \exp(-l_{t,i}(u, v))}, \tag{16}$$

where *z*<sub>1:*t*</sub> and *x*<sub>1:*t*</sub> are the observations and vehicle states up to time *t*, respectively, (*u*, *v*) is the pixel location in the image, and *l*<sub>*t*,*i*</sub>(*u*, *v*) is the log-odds value at pixel (*u*, *v*) for the *i*-th TL. The log-odds is updated additively in the BBF.

$$l^{Prior}_{t,i} = \alpha \, l^{Post}_{t-1,i} + l^{Obs}_{t,i}, \tag{17}$$

where *l*<sup>Post</sup><sub>*t*−1,*i*</sub> is the posterior log-odds at the previous time step for the *i*-th target signal, and the initial value of *l*<sup>Prior</sup><sub>*t*,*i*</sub> is set to 0. *α* is the decay rate for the previous posterior probability. *l*<sup>Obs</sup><sub>*t*,*i*</sub> is calculated based on the position and size of the obtained candidates. Figure 8 shows a typical example of the probability updating. For each detected candidate, the rectangular area where the TL may exist is calculated, and the observation distribution *l*<sup>Obs</sup><sub>*t*,*i*</sub> is determined based on a Gaussian distribution around the rectangular area. A likely region can then be estimated by performing the time-series processing.
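Equations (16) and (17) reduce to a decayed additive update in log-odds space followed by a logistic squash. A minimal per-pixel sketch (α = 0.9 is an illustrative decay rate):

```python
import numpy as np

def bbf_update(l_post_prev, l_obs, alpha=0.9):
    """Eq. (17): decay the previous posterior log-odds by alpha and add
    the new observation term."""
    return alpha * l_post_prev + l_obs

def log_odds_to_prob(l):
    """Eq. (16): convert log-odds back to an existence probability."""
    return 1.0 / (1.0 + np.exp(-l))
```

Repeated consistent observations drive the probability toward 1, while the decay term lets stale evidence fade when observations stop.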

**Figure 8.** Probability updating.

#### *3.9. Arrow Signal Recognition*

In addition to the TL detection, an arrow signal is recognized when the TL has the attribute of arrow lights in the HD map. In the Japanese traffic environment, arrow lights are generally lit together with red or yellow signals. After detecting a yellow or red signal, an arrow-detection ROI is determined as shown in Figure 9. In the recognition process, a right-arrow detector is trained in advance using AdaBoost (the cascade detector in OpenCV) and applied to the extracted ROI. In order to detect left or straight arrows, the ROI image is rotated and the same detector is used to search for objects.
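Reusing a single right-arrow detector for all three directions only requires rotating the ROI before detection: a left arrow becomes a right arrow after a 180-degree turn, and a straight (upward) arrow after a 90-degree clockwise turn. A minimal sketch; `detect_right` stands in for the trained cascade detector:

```python
import numpy as np

def detect_arrows(roi, detect_right):
    """Search all three arrow directions with one right-arrow detector by
    rotating the ROI.  detect_right is any callable that returns True
    when a right arrow is found in the given image."""
    return {
        "right":    bool(detect_right(roi)),
        "left":     bool(detect_right(np.rot90(roi, 2))),   # 180 degrees
        "straight": bool(detect_right(np.rot90(roi, -1))),  # 90 deg clockwise
    }
```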

**Figure 9.** Arrow light recognition.

#### *3.10. Prior Information using Digital Map*

By using the proposed method described above, the TL recognition is realized by detecting the candidate objects, classifying the lighting color, and computing the confidence using the existence probability. This work further improves the recognition performance, especially for distant arrow lights, by utilizing the prior information given in the digital map.

In TL recognition, when there are multiple candidate objects, candidates can be weighted according to the distance of the TLs in the probability-updating procedure. This is expected to reduce false-positive detections in the background.

On the other hand, arrow recognition can be improved by providing the pattern of the target TL from Figure 2 as prior information. For example, Figure 10 illustrates a typical arrow-recognition scene. In the recognition of a distant arrow light, if it is difficult to visually recognize the direction of the lit arrow, false positives or false negatives may occur. In Figure 10, it can be seen that some arrow lights are lit in the ROI image, but it is difficult to distinguish their directions. In this case, because the lighting part of the arrow light is blurred, a candidate point may be detected at the arrow light as well as at the circular lights. Normally, this detected candidate would be a false-positive detection of a green signal. However, if the relative positional relationship of the arrow lights within the TL is provided as prior information, it is possible to distinguish the direction of the arrow lights. Our work evaluates how this prior information contributes to the recognition of TLs and arrow lights by the proposed method.

**Figure 10.** Arrow light recognition using prior information.

#### **4. Evaluations**

#### *4.1. Condition*

Experiments were carried out to evaluate the effectiveness of the proposed method with actual driving data. The driving data were collected using the automated vehicle owned by our group, shown in Figure 11. The vehicle was equipped with sensors such as LiDAR, MWR, GNSS/INS and a camera to observe its surroundings. A 3-D LiDAR Velodyne HDL-64E S2 with 64 separate beams was mounted on the vehicle to take measurements of the environment; it measured the 3-D omnidirectional distance at a frequency of 10 Hz. An Applanix POS/LV220 coupled GNSS/INS was mounted on the vehicle, providing an accurate position (latitude, longitude and altitude) and orientation (pitch, yaw, roll) at 100 Hz. In addition, in order to observe the traffic lights, the vehicle was equipped with a Pointgrey Flea2 mono-camera, which provided a 1280 × 960 pixel resolution at 7.5 Hz. Both lamp-type and LED-type TLs had to be recognized. Because LED-type TLs blink at high frequency, a TL may appear to be turned off in a captured image, depending on the shutter speed. To avoid this problem, the shutter speed was limited to between 10 ms and 20 ms for the auto-exposure function. As a result, images of dark scenes such as the evening were almost completely dark, as shown in Figure 1b, and the shape of the traffic light could not be seen.

**Figure 11.** Experimental vehicle.

This vehicle had various functions necessary to enable automated driving in an urban area, and it has been operating in Japan for several years. In previous works, real-time localization algorithms were developed using different types of sensors such as 3-D LiDARs, cameras, and MWRs [7,29,30]. Therefore, it is assumed in this evaluation that the precise vehicle pose has been estimated using such localization algorithms.

Table 2 shows the number of images in the training and test datasets. These datasets were recorded in Kanazawa City and Suzu City in Ishikawa, Japan. As described in Section 3.1, the TL recognition aims to recognize a total of six states: three states of TLs (green, yellow, and red) and three states of arrow TLs (left, straight, and right). The training data were used to train the detectors for the overall shape of the traffic light and for the arrow light using machine learning. These data consisted of images measured during the daytime, in which the overall shape of the TL is visible. On the other hand, the test dataset consisted not only of daytime scenes but also of dark scenes, i.e., images measured in the early morning and evening hours. In addition, Figure 12 shows the frequency distribution of the test data for distances from 30 m to 150 m. Although the ratio of arrow TL data was small overall, the test data were distributed almost uniformly from short to long distances.

**Table 2.** Experimental conditions: number of data.

**Figure 12.** Histogram of number of test data in different distances from the TL.

In evaluating the performance of the proposed method, it was important to compare it objectively to existing methods. However, it was difficult to directly compare the reported performance of other works because of the different types of cameras and sensors and the different experimental conditions, such as driving area and weather. Therefore, in addition to the evaluation based on the recognition distance, the recognition performance for objects with pixel sizes similar to those in other works was also evaluated. Table 3 summarizes the pixel sizes of the bounding boxes in existing datasets and in our test data. The table indicates that the test data in this work were challenging, including traffic lights and arrow lights with both large and small pixel sizes, even compared to existing public datasets.


**Table 3.** Bounding box pixel size in test dataset and other open dataset.

The evaluations carried out in this work are summarized below:


YOLOv3 was adopted as one of the state-of-the-art methods because it is a widely used DNN for object detection. The ML detector and the arrow detector (AdaBoost) of the proposed method, as well as YOLOv3, were trained using the training data in Table 2. As described above, the AdaBoost detectors were divided into two types (the TL detector and the arrow TL detector). The YOLOv3 model was trained to recognize the six classes of TLs and arrow TLs. In the evaluation of the recognition rate, the dataset was divided into intervals of 10 m, and the precision, recall, and F-value for the data in each interval were used as evaluation measures. The evaluation metrics were calculated by the following equations.

$$Precision = \frac{TP}{TP + FP} \tag{18}$$

$$Recall = \frac{TP}{TP + FN} \tag{19}$$

$$F\text{-}value = \frac{2 \cdot Recall \cdot Precision}{Recall + Precision}, \tag{20}$$

where *TP* is the number of true-positives, *FP* is the number of false-positives, and *FN* is the number of false-negatives. The TLs and the arrow TLs were evaluated separately because of their different recognition difficulty and different numbers of data samples.
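The binned evaluation of Equations (18)–(20) can be computed as below; the per-frame count format is an assumption for illustration, and zero-count cases are returned as 0 to keep the binning well defined.

```python
def prf(tp, fp, fn):
    """Precision, recall and F-value per Equations (18)-(20)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_value = 2 * recall * precision / denom if denom else 0.0
    return precision, recall, f_value

def binned_metrics(samples, bin_m=10):
    """samples: iterable of (distance_m, tp, fp, fn) counts; returns
    {bin_start_m: (precision, recall, f_value)} for 10 m intervals."""
    bins = {}
    for d, tp, fp, fn in samples:
        b = int(d // bin_m) * bin_m
        t = bins.setdefault(b, [0, 0, 0])
        t[0] += tp; t[1] += fp; t[2] += fn
    return {b: prf(*t) for b, t in sorted(bins.items())}
```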

The relevant parameters of the proposed method were set as follows: *dT* = 150 m, *θ<sup>T</sup>* = 30 deg, *kroi* = 3.0, *ss* = 1.2 m, *sl* = 0.4 m, *Hmin* = 20,000, *kmax* = 0.5, *kmin* = 2.5, *α* = 0.8. The computer used for the evaluation was a Windows 10 desktop PC with an Intel Xeon E5-1620v4 CPU (3.50 GHz), 16 GB of memory, and an NVIDIA GeForce GTX TITAN X GPU. The processing on the CPU was run in a single thread.

#### *4.2. Results*

Figure 13 and Table 4 show the experimental results using the proposed method and YOLOv3 for the whole test dataset. Figure 13 indicates the recognition rate in each interval for the TLs and arrow TLs, and Table 4 indicates the averages of the precision, recall, and F-values obtained over the intervals. In the TL recognition, the recognition rate of the proposed method was more than 90% even at approximately 100 m. Although the recognition rate decreased as the distance increased, it was confirmed that 80% or more could be maintained even 150 m away. Comparing the results with and without spatial prior information, the precision, recall, and F-values were slightly improved by using spatial prior information; that is, a slight reduction of false-positive and false-negative detections was confirmed. YOLOv3, on the other hand, achieved a similar recognition rate at short distances of less than 40 m, but its recognition rate dropped more sharply than that of the proposed method as the distance increased. Comparing the recognition rates for the arrow TLs, there was a large difference between the proposed method and the other methods. Figure 14 shows the average number of pixels of the lighting area of the TLs at each distance. For the machine-learning-based object detectors, the recognition rate was extremely low for arrow TLs smaller than 10 pixels. The proposed method, however, suppressed this performance degradation and greatly improved the recognition rate at around 100 m. Therefore, it was shown that the proposed method, which uses the relative position between the TL and the arrow TL as prior information, can extend the recognition distance by 10–20 m.


**Figure 13.** F values for whole data: with or without the prior information.

Next, we objectively compare the recognition results of this study with the performance reported in existing works. de Charette reported a precision of 95.38% and a recall of 98.41% (37.4 ms per frame) as the recognition performance on LARA, a French dataset [13]. In addition, Chen reported a recognition rate of 95.75% (3 Hz) on WPI, a USA dataset [14]. According to Table 3 and Figure 14, in our test data, the data within the range of 120 m (6 pixels or more) correspond to the difficulty of LARA, and the data within 60 m (11 pixels or more) correspond to the difficulty of WPI. Table 5 summarizes the recognition rates of the proposed method for each range of test data. The evaluation results for our test data indicate that the precision was 97.1% for data within 60 m and 95.7% for data within 120 m. Although there were differences due to the different driving environments, approximately similar precision was obtained. In particular, a processing time corresponding to 3 Hz has been reported for the method [14] that can recognize both TLs and arrow TLs using PCANet [15], a compact DNN model. In the proposed method, by contrast, the arrow detector model could be simplified by using prior information, and the average processing time was 67 ms. Therefore, the proposed method achieved a recognition rate similar to the state of the art with compact processing.

**Figure 14.** Average pixel size of the lighting area at different distances.



In order to detect candidate objects, the proposed method combined three kinds of methods: circular features by SURF, blob features by binarized images, and features of traffic signal shape by AdaBoost detector. To evaluate the contribution of each method, the obtained recognition rates were compared with the results obtained when each detection method was used alone. Figures 15–17 show the obtained recognition rates for each detection method. These graphs indicate the recognition rate for all test data, the daytime scene data, and the dark scene data, respectively. Table 6 summarizes the average precision, recall, and F-value for the recognition result under each condition. From these results, it was confirmed that the method of circular extraction by SURF showed almost the same level of performance as the proposed method. Introducing the proposed method improved the average recognition rate by approximately 0.4%.

Although the recognition rates of the proposed method and SURF were almost identical, a detailed analysis was performed to verify the superiority of the proposed method. Figure 18 shows a graph summarizing the differences in precision and recall between the two methods. The figure confirms that the proposed method is superior in short-range recall and long-range precision; that is, the false-negative rate at short range and the false-positive rate at long range were improved. Circular extraction by SURF enables appropriate detection by producing a feature image in which the lighting area is sufficiently enhanced. However, for a lamp-type traffic light with weak brightness, the enhancement in the feature image may be insufficient, causing a false-negative detection. In such a case, if the overall shape of the TL is visible, the false-negative can be recovered by recognizing the TL with the ML detector. On the other hand, for distant recognition, blob feature points and SURF circular features are detected simultaneously. As a result, the recognition rate was improved because false detections in the background could be differentiated from true detection points among the obtained candidates. Thus, although the overall improvement rate is small, the proposed method improved the recognition rate in specific situations and thus proved to be more robust.

**Figure 15.** F values for whole data: different candidate extraction methods.


**Table 6.** Experimental results: different candidate extraction methods.

**Figure 16.** F values for daytime data: different candidate extraction methods.

**Figure 17.** F values for dark data: different candidate extraction methods.

**Figure 18.** Difference in precision/recall values between the proposed method and speeded-up robust features (SURF).

Figure 19 shows typical true-positive scenes for small arrow detection. By utilizing prior information, the proposed method can detect small arrow TLs of 10 pixels or less. As shown in Figure 19, an arrow of 5 pixels could be recognized. However, according to the evaluation results, there were difficult scenes where false-negatives occurred. Figure 20 shows typical false-negative images. On suburban roads, old lamp-type signals are installed. Such traffic lights have very low brightness, and in some images it is difficult even for humans to see the lighting color. In such situations, even if the overall shape of the TL could be detected, a false-negative occurred because the lighting color was not bright enough.

Finally, the processing time of each method was evaluated. Table 7 shows the average processing time and standard deviation of each method. In this experiment, the images were captured at 7.5 Hz; thus, a method that can process each frame within 133 ms can be said to operate in real time. According to Table 7, real-time operation is possible because the average processing time of all methods was within 100 ms. However, while YOLOv3 required processing on the GPU, the other methods, including the proposed method, could be operated with only the CPU, which makes the proposed method easier to deploy in practice. Automated vehicles are required to run many recognition and decision-making processes in a limited computing environment, and in this respect the usefulness of the proposed method was shown by analyzing both its recognition performance and its processing performance. According to Table 7, there were cases where the proposed method instantaneously took about 100 ms. This posed no critical problem for the decision of approaching the intersection, because the vehicle moves only about 1 to 2 m during that time. The validity of the TL recognition during automated driving is evaluated in the next section.

**Figure 19.** Typical improvement for far arrow TLs.

(a) Distant TL (green, 150 m, 5 × 12 px). (b) Near TL (green, 12 m, 68 × 200 px).

**Figure 20.** Typical false-negative situations due to low brightness of lamp-type TLs.

**Table 7.** Experimental results: computational time.


#### *4.3. Verification for Automated Driving*

We showed the superiority of the proposed method by evaluating the recognition distance for TLs and arrow TLs. Next, additional verification was carried out to determine whether the proposed method achieves the performance required in actual automated driving. In the verification experiment, automated driving was performed in the Odaiba area of Tokyo, as shown in Figure 21. In automated driving, the role of TL recognition is to decide on the intersection approach in accordance with the traffic rules. If the signal state at the intersection is not properly recognized, the decision to enter the intersection will be incorrect and unnatural acceleration or deceleration will occur. In order to evaluate such situations, it is necessary to investigate the stability of the TL recognition results and the vehicle acceleration when the automated vehicle is driving at the maximum velocity (the speed limit). Therefore, the transitions of velocity and acceleration when passing through intersections were verified while recognizing TLs.

**Figure 21.** Driving routes for verification experiments.

Driving data were measured on the following two types of travel routes.


In both routes there was no preceding vehicle, because the automated vehicle was to drive at the maximum velocity. It was then verified whether the vehicle could drive smoothly while properly recognizing each TL. In this verification experiment, a camera different from that of Section 4.1 was used owing to a hardware issue. A camera with a resolution of 1920 × 1080 was installed, with a vertical field of view similar to that of the camera described in Section 4.1. Therefore, the recognition distance was extended in proportion to the resolution ratio compared to the recognition distance in Section 4.2. Figures 22 and 23 show the velocity and acceleration for each set of driving data and the TL state recognized at each intersection. The bar graph shows the TL status recognized at each time; the color of the bar is the light color, and the numerical value below it is the distance to the corresponding TL. According to Figure 22, the TL state was recognized immediately upon entering the 150 m recognition range. At intersections C and E, the vehicle stopped at a red light; in these cases, the vehicle stopped smoothly with moderate deceleration. On the other hand, Figure 23 shows a driving scene passing through an intersection while recognizing an arrow light in the straight direction. As found in the evaluation in Section 4.2, the recognition performance for arrow lights deteriorates for distant objects; therefore, the recognition result at a point 100 m before intersection G was unstable. The experimental results confirm that the recognition result became stable at a point 85 m away. Even in this situation, the transitions of speed and acceleration show that no unnecessary deceleration occurred. Thus, the recognition results obtained by the proposed method had the performance necessary for driving smoothly through intersections in urban areas.
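The resolution-driven extension of the recognition range can be checked with a back-of-the-envelope calculation. The scaling assumption (same vertical field of view, so on-target pixel size grows with vertical resolution) is ours; the paper states only that the range extends in proportion to the resolution ratio.

```python
# With a similar vertical field of view, an object of fixed physical size
# spans proportionally more pixels on the higher-resolution sensor, so the
# distance at which it reaches a given pixel size grows by the same ratio.
base_range_m = 150.0     # recognition range with the 1280 x 960 camera
ratio = 1080 / 960       # vertical resolution ratio of the new camera
extended_range_m = base_range_m * ratio   # about 169 m under this assumption
```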

In addition, in our group's work, demonstration experiments of automated driving have been conducted in the area shown in Figure 21 since September 2019. Over that half-year period, there were no critical problems regarding the recognition distance or the stability of the TL recognition. In this respect, the validity of the proposed method was also confirmed qualitatively.

**Figure 22.** Verification results for automated driving from the intersection A to E.

**Figure 23.** Verification results for automated driving from the intersection F to G.

#### **5. Conclusions**

This work has proposed a TL recognition algorithm for urban automated driving. We prepared a challenging dataset that includes traffic lights and arrow lights with both large and small pixel sizes. The proposed method can be processed in real time on a CPU, and our work verified that it can recognize TLs within 150 m with an average F-value of 91.8%. This F-value is the recognition rate in a single frame; when approaching an intersection from a distance of 150 m, about 100 frames are processed, so the state of the intersection can be estimated with high confidence. The evaluations verify the following contributions of the work.
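The multi-frame argument can be made concrete. Treating the 91.8% average per-frame F-value as a per-frame success probability and the roughly 100 frames as independent (an idealizing assumption; consecutive frames are in fact correlated), the chance that a majority vote over the approach is wrong is vanishingly small:

```python
from math import comb

def majority_correct_prob(p_frame, n_frames):
    """Probability that more than half of n_frames independent per-frame
    results are correct, given per-frame success probability p_frame."""
    k_min = n_frames // 2 + 1
    return sum(comb(n_frames, k) * p_frame**k * (1 - p_frame)**(n_frames - k)
               for k in range(k_min, n_frames + 1))
```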


In the arrow recognition by the proposed method, the arrangement pattern of the signal lights and arrow lights given in Figure 2 was used as prior information to improve the recognition rate for distant objects. The map information corresponding to such prior information is practically available, because it is included as static information in the HD maps that have already been studied [31]. On the other hand, although the evaluation in this experiment showed a recognition rate of more than 90%, there were cases where recognition became difficult. In addition to the cases in Figure 20, it can be relatively difficult to determine the lighting color of a lamp because of the surrounding brightness. For example, when sunlight strikes from behind the vehicle, all lamps may appear to be lit. Moreover, there are cases where yellow and red lighting colors appear as the same color in the images. These cases lead to false-positive detections. Beyond the issues of software-based recognition algorithms, the performance limits of the hardware also need to be discussed from a practical point of view. As described in [32], there are situations in which it is difficult to see the TLs in the image under severe sunshine, which is a hardware performance limit. It is desirable to develop practical recognition methods while addressing both such software and hardware issues.

**Author Contributions:** Conceptualization, and methodology, K.Y., A.K. and N.S.; software and validation, K.Y., A.K. and T.A.; data collection, K.Y., A.K., N.S. and R.Y.; data analysis, K.Y., A.K., T.A. and M.A.; writing—original draft preparation, K.Y.; writing—review and editing, K.Y. and A.K.; supervision, N.S.; project administration, K.Y.; funding acquisition, N.S. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by New Energy and Industrial Technology Development Organization (NEDO).

**Acknowledgments:** This work was supported by the Council for Science, Technology and Innovation (CSTI), Cross-ministerial Strategic Innovation Promotion Program (SIP), "Research and development for recognition technologies and other technologies necessary for automated driving technologies (levels 3 and 4)" (funded by NEDO).

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


c 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

### *Article* **A Repeated Game Freeway Lane Changing Model**

#### **Kyungwon Kang and Hesham A. Rakha \***

Center for Sustainable Mobility, Virginia Tech Transportation Institute, Blacksburg, VA 24061, USA; kwkang@vt.edu

**\*** Correspondence: hrakha@vt.edu; Tel.: +1-540-231-1505

Received: 29 January 2020; Accepted: 5 March 2020; Published: 11 March 2020

**Abstract:** Lane changes are complex safety- and throughput-critical driver actions. Most lane-changing models deal with lane-changing maneuvers solely from the merging driver's standpoint and thus ignore driver interaction. To overcome this shortcoming, we develop a game-theoretical decision-making model and validate the model using empirical merging maneuver data at a freeway on-ramp. Specifically, this paper advances our repeated game model by using updated payoff functions. Validation results using the Next Generation SIMulation (NGSIM) empirical data show that the developed game-theoretical model provides better prediction accuracy compared to previous work, giving correct predictions approximately 86% of the time. In addition, a sensitivity analysis demonstrates the rationality of the model and its sensitivity to variations in various factors. To provide evidence of the benefits of the repeated game approach, which takes into account previous decision-making results, a case study is conducted using an agent-based simulation model. The proposed repeated game model produces superior performance to a one-shot game model when simulating actual freeway merging behaviors. Finally, this lane change model, which captures the collective decision-making between human drivers, can be used to develop automated vehicle driving strategies.

**Keywords:** lane-changing; merging maneuvers; game theory; decision-making; intelligent vehicles

#### **1. Introduction**

Driving behavior strongly affects the safety and throughput of the transportation system [1]. Due to its interference with surrounding vehicles, lane-changing significantly affects traffic stream flow. Several studies have concluded that lane-changing produces a capacity drop, forming a bottleneck [2–4]. The impacts of lane-changing maneuvers have been modeled in several studies [5–8]. In particular, Liu et al. [9] argued that traffic conflicts between merging and through vehicles, which are common near freeway on-ramps, are notable for inducing shockwaves, resulting in congestion. In order to analyze traffic flow, therefore, the development of a state-of-the-art lane-changing model is important.

The applications of lane-changing models can be broadly classified into two groups: adaptive cruise control and microscopic traffic simulation [1]. Driving assistance models for adaptive cruise control consist of collision prevention models and automation models [10]. In addition, driving decision models focus on drivers' lane-changing decisions for different traffic conditions and for different situational and environmental characteristics [10]. Lane-changing models were proposed based on various methodologies, which are reviewed in the next section, and calibrated based on field data collected on freeways. These models are an important component of microscopic traffic simulation [11]. Most models, however, focus on only the lane-changing vehicle in decision-making and vehicle control, which could be detrimental in microscopic traffic simulation, as interaction with surrounding vehicles is also critical in lane-changing. Specifically, drivers of vehicles surrounding the lane-changing vehicle, especially the closest following vehicle in the target lane, react after recognizing a lane-changing vehicle's intention to change lanes. For example, a human driver will sometimes not allow a lane change. Even though this type of competitive lane-changing behavior is rarely observed, decision-making considering drivers' interaction when changing lanes should be studied in order to develop a precise lane-changing model.

In addition, modeling a driving strategy for automated vehicles (AVs) gives rise to a new application for lane-changing models. The introduction of AVs onto the roadway means that reasonable lane-changing decision-making can be conducted by an intelligent robot or a well-programmed machine. During the transition to fully autonomous transportation systems, harmonization with human drivers will be necessary for the operation of AVs. Therefore, the development of a realistic lane-changing model that can depict human drivers' decision-making is also required to enhance AVs' driving performance.

To model lane-changing behaviors considering realistic decision-making, we developed a game-theoretical decision-making model for merging maneuvers at a freeway on-ramp [12], and then proposed a repeated game model [13]. This paper enhances our repeated game lane-changing model proposed in [13] and evaluates the proposed model's performance. The paper begins by introducing lane-changing models based on various methodologies, including the game theoretical approach. To enhance model efficiency and complement the multivariate function in the previous model, the payoff functions for a stage game are reformulated in Section 3. This study also applies the repeated game approach, which uses cumulative payoffs, in order to capture realistic human driver behavior at a freeway merging section. Both the repeated game model and the one-shot game model based on the reformed stage game are calibrated and validated using empirical data extracted from the Next Generation SIMulation (NGSIM) dataset [14,15] to demonstrate their prediction ability. In the rest of this paper, we present a sensitivity analysis to describe the stage game's efficiency. The simulation case study using an agent-based model (ABM) follows. Finally, we draw concluding remarks on this work and point out areas of potential future research.

#### **2. Literature Review**

A comprehensive literature review is required to introduce previous research efforts and present the motivations for this study. This section begins with a review of lane-changing models, focusing on methodologies. Then, game theory-based models are introduced in detail. Based upon the literature review, the motivations for the study are presented.

#### *2.1. Lane-Changing Decision-Making Models*

In general, the lane-changing process can be categorized as a sequence of four steps: (1) checking for lane-change necessity, (2) lane selection to decide on a target lane, (3) gap choice in the target lane, and (4) lane-changing execution through gap acceptance. To model lane-changing behaviors, lane-changing models have been developed using various methodologies that can be grouped into four types: (1) rule-based models, (2) discrete-choice-based models, (3) artificial intelligence models, and (4) incentive-based models [1].

The first model type, the rule-based model, is one of the most popular driver-perspective-based methodologies [1]. Drivers' decisions in the lane-changing process are simply defined as the independent variable. Gipps [16] initially introduced a lane-changing model covering various urban driving situations, which was intended for microscopic traffic simulation tools [17]. Gipps' model represented the lane-changing process as a decision tree with a series of fixed conditions, where the final output of this rule-based triggered event is a binary choice (i.e., change or no change) [1]. The CORridor SIMulation (CORSIM) model classified lane changes into two types: (1) discretionary lane-changing (DLC), which occurs when a driver is unsatisfied with the driving situation in their current lane while the target lane shows better driving conditions; and (2) mandatory lane-changing (MLC), which is coercively required according to the route choice (i.e., a lane change toward an on-ramp or off-ramp) [18,19]. Rahman et al. [1] categorized the game theory-based model, which explains lane-changing when a traffic conflict arises between the merging vehicle and the closest following vehicle in the target lane, as a rule-based model. Game theory, which is used in this paper, is the study of mathematical models of conflict and cooperation between decision-makers [20]. It focuses on decision-making in consideration of the interaction between intelligent drivers. Using a game theoretical approach is advantageous in that it takes into account the behavior of the following driver in the target lane, while the other approaches introduced above focus only on the lane-changing driver's decision.

The second model type, the discrete-choice model, relies on a logit or probit model to describe lane-changing maneuvers. Lane-changing is decided based on probabilistic results instead of binary answers. Ahmed [21] modeled lane-changing motivation (i.e., trigger to change a lane), target lane choice, and gap acceptance, presenting three categories of lane-changing: DLC, MLC, and forced merging (FM), in which a gap is not sufficient but a driver nonetheless executes a lane-changing maneuver in heavily congested traffic conditions. Ahmed [21] assumed that critical gaps follow a lognormal distribution to guarantee that they are nonnegative. Toledo et al. [22] developed a probabilistic lane-changing decision model by combining MLC and DLC through a single utility function. Both models developed by Ahmed [21] and Toledo et al. [22] considered drivers' heterogeneity, such as aggressiveness and driving skill level, using a random term as one of the explanatory variables.

The third model type, which includes fuzzy models and artificial neural network (ANN) models, is artificial intelligence models. The fuzzy model considers humans' imprecise perception and decision biases, and incorporates more variables than the common mathematical models [23]. However, the fuzzy model has disadvantages, such as unexpected difficulties and complexity in the fuzzy rules [23]. The ANN model processes information using functional architecture and mathematical models that are similar to the neuron structure of the human brain [1]. Hunt and Lyons [24] modeled the lane-changing decisions of drivers on dual carriageways. Since the neural network model is completely data-driven and requires field-collected traffic data, Hunt and Lyons used interactive driving simulation to train the model. As this example shows, one major disadvantage of the ANN model is that it requires a huge amount of data to be optimized and also requires a training period.

The last type of model, the incentive-based model, models lane-changing desire utilizing the defined incentive. In other words, this model assumes that a driver chooses to change lanes in order to maximize their benefits [1]. The minimizing overall braking induced by lane change (MOBIL) model, which was developed in Kesting et al. [11], is based on measuring both the attractiveness and the risk associated with lane changes in terms of acceleration. Therefore, both the incentive criterion and the safety constraint are formed using the acceleration function of the underlying car-following model. In addition, the model attempts to capture the degree of passive cooperation among drivers, using the politeness factor as a weight on the term for total advantage of the surrounding vehicles.
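The MOBIL criterion described above can be sketched as follows. The accelerations would in practice come from the underlying car-following model; the parameter values and the simplification to a single (new-follower) disadvantage term are illustrative, not Kesting et al.'s calibrated formulation.

```python
def mobil_lane_change(a_ego_new, a_ego_old,
                      a_follower_new, a_follower_old,
                      politeness=0.3, a_threshold=0.1, b_safe=4.0):
    """Incentive + safety criterion in the spirit of MOBIL: change lanes if
    the ego's acceleration gain exceeds the politeness-weighted disadvantage
    imposed on the new follower, and the follower is not forced to brake
    harder than b_safe (all accelerations in m/s^2)."""
    if a_follower_new < -b_safe:          # safety constraint
        return False
    gain_ego = a_ego_new - a_ego_old
    loss_follower = a_follower_old - a_follower_new
    return gain_ego - politeness * loss_follower > a_threshold
```

A higher politeness factor makes the modeled driver forgo changes that would inconvenience the follower, which is how MOBIL captures passive cooperation.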

#### *2.2. Game Theory-Based Lane-Changing Decision-Making Model*

It is clear that lane-changing involves not only the driver of the subject vehicle (SV), who is motivated to change lanes, but also the driver of the lag vehicle (LV) in the target lane, who controls their own vehicle (i.e., the LV) after perceiving the lane-changing vehicle in the adjacent lane. Specifically, the driver of SV controls their longitudinal and lateral movements to safely change lanes in consideration of surrounding vehicles, and the driver of LV responds by showing acceptance or non-acceptance of the SV's lane-changing intention. This decision-making process involving both drivers motivated previous studies to use a game theoretical approach. Lane-changing was therefore modeled in these studies as a two-player non-cooperative game.

Kita [25] modeled merging-giveway interaction between vehicles in a merging section based on a game theoretical approach. The action strategies of the driver of SV are merging or maintaining the current lane, while the strategies of the driver of LV in the target lane are giving way (i.e., yielding) or not. Kita [25] modeled interaction between drivers as a game under perfect information conditions. However, perfect information in game theory indicates that all players have perfect and instantaneous knowledge of their own utility and the events that have previously occurred. In a traditional transportation environment, in which a driver becomes aware of their surroundings through sight only, this assumption is irrational. Additionally, Kita's model assumed that vehicle speeds were constant during the merging process, which is likewise unrealistic [9].

Liu et al. [9] modeled merging and yielding behavior using payoff functions built around the drivers' objectives. In Liu et al. [9], the objective of the driver of SV is to minimize the time spent in an acceleration lane subject to safety constraints, while the objective of the driver of LV is to minimize speed variation. The payoffs of the drivers of the SV and LV were formulated, respectively, using the acceleration level and the time that the merging vehicle spends in the acceleration lane for each action strategy. However, the driver of SV occasionally showed behaviors that deviated from this assumed objective. Kondyli and Elefteriadou [26] found that all drivers want to reach a speed close to the freeway speed or the speed limit if there is no lead vehicle. This speed synchronization process, which causes drivers to accelerate when arriving at the beginning of an acceleration lane, was observed at a merging section on a freeway [27]. To solve the game, Liu et al. [9] proposed a bi-level calibration framework for finding the Nash equilibrium, in which the upper-level programming is an ordinary least squares problem and the lower-level programming is a linear complementarity problem.

In [12], we modeled a decision-making game model for merging maneuvers using five decision factors and evaluated the proposed model using NGSIM data. In addition, we introduced a repeated game approach in order to avoid an instantaneous fluctuation in decisions in microscopic simulation [13]. Even though these models showed high prediction accuracy, there were limitations, namely that the number of data showing all action strategies sets was unbalanced due to data collection during the morning peak time, and the model validation results were unable to show the distinct performance of the repeated game approach in microscopic simulation.

The development of advanced vehicle technologies (e.g., vehicle-to-vehicle communication) and AVs has led recent research efforts to focus on the cooperative interaction between vehicles [28,29]. Talebpour et al. [29], for instance, modeled both mandatory and discretionary lane-changing by applying the Harsanyi transformation [30] within a connected environment. Yu et al. [31] designed a human-like, game theory-based controller for AVs in consideration of mixed traffic.

#### *2.3. Motivation and Contribution of the Paper*

The following are the contributions of this paper. First, we enhance the payoff functions previously developed in [12,13] by taking multiple decision factors into consideration and normalizing the decision variables. Multivariate functions using variables with different units may induce a trivial equilibrium solution when the variables are correlated. To solve this issue, we reformulated the payoff functions using dimensionless variables. Second, we validate and compare the previous and proposed models. Third, we conduct a sensitivity analysis of the proposed model's performance. Fourth, we demonstrate the benefits of a repeated-game approach using a simulation tool. The repeated game model first introduced in [13], in which a stage game is repeatedly played taking previous game results into consideration, showed no evidence of benefits compared to a one-shot game model, played independently on instantaneous data at every decision point. If there is competition between drivers due to an ambiguous merging situation (for example, not only small lag spacings but also similar vehicle speeds), the one-shot game model may be sensitive to instantaneous data, causing fluctuations in driver decisions during the decision-making process. The repeated game model's initial cooperative decision, on the other hand, can be expected to remain the same when there is only a slight variation in payoffs. Furthermore, the game model can produce a change from a non-cooperative to a cooperative game. Even though this type of driver competition in merging seldom occurs, a robust game model can be integrated into microscopic traffic simulation software in order to simulate stereotypical vehicle movement patterns. Consequently, in this study we adopt the previous repeated game approach with enhancements in the payoff function and then provide evidence of the repeated game model's benefits through a case study.

Lastly, a desired acceleration level, calculated to achieve the action set chosen by both players, should be an additional component of a vehicle acceleration model. A lane-changing model based on a game theoretical approach captures the decision-making process between two intelligent decision-makers. The model output is an action that will be carried out by the two players at future time steps, rather than a decision to start lane-changing. To depict practical lane-changing behaviors in a microscopic traffic simulator, therefore, the game model should be integrated with other models, such as car-following, lane selection, and gap acceptance models. This study develops a simulation model based on agent-based modeling (ABM), including a vehicle acceleration controller based on the game model and a car-following model, and then conducts a simulation study to evaluate the performance of the repeated game model.

#### **3. Merging Decision-Making Model Using a Repeated Game Concept**

As previously noted, this study aims at developing a decision-making game for merging maneuvers on a freeway based on the repeated game concept. The following subsections describe in detail the stage game for merging decision-making, the repeated game design, and the development of the players' payoff functions.

#### *3.1. Stage Game Design*

The game model defines the number of players, the action strategies of each player, and the corresponding payoff functions that describe the outcome for each player throughout the game [32]. This study adopts the decision-making game model structure for merging maneuvers proposed by the authors in 2017, which consists of two players: the drivers of the SV and the LV. The driver of SV, who wants to make a lane change, has three action strategies (see Figure 1a): (1) change lane (*s*1), (2) wait for the LV's overtaking maneuver (*s*2), or (3) overtake the LV and use a forward gap to merge (*s*3). The opposite player, the driver of LV, has two action strategies (see Figure 1b): (1) yield to allow the lane change maneuver of the driver of SV (*l*1) or (2) block the SV's merging maneuver by decreasing the spacing available for the SV (*l*2) [12]. In real-life situations, the driver of LV can choose to change lanes to the left lane to avoid a potential collision or considerable deceleration [33], and this lane-changing behavior was considered as an action strategy of the driver of LV in [29]. Freeway vehicles in the rightmost lane generally change lanes upstream of the merging section, after perceiving the approach of the merging vehicle, in order to maintain their speed. Since this mainline vehicle's lane change is conducted earlier and thus does not involve interaction with the merging vehicle, this study does not include a lane-changing action among the actions of the driver of the LV in the proposed merging game.

Let *S* = {*s*1,*s*2,*s*3} and *L* = {*l*1, *l*2} denote the sets of pure strategies for the drivers of the SV and LV, respectively. In addition, *a* = (*si*, *lj*) denotes a set of actions (*a* ∈ *S* × *L*), where *i* and *j* indicate the indices of the action strategies of the drivers of the SV and LV (i.e., *i* = 1, 2, 3 and *j* = 1, 2). As such, a total of six sets of action strategies were defined for the non-cooperative decision-making stage game. Among these action strategies, (*s*1, *l*1), (*s*2, *l*2), and (*s*3, *l*1) are cooperative action strategies, whereas both (*s*1, *l*2) and (*s*2, *l*1) are non-cooperative strategies in which both players compete to achieve their objectives. The action strategy (*s*3, *l*2) is neither cooperative nor competitive. The proposed stage game with imperfect information, which captures the fact that players are simply unaware of the actions chosen by other players, is represented in Figure 2. In the figure, a dashed line uniting three nodes, which implies imperfect information, indicates that the players do not know which node they are in. This means that there is no sequence in making a decision, and thus the driver of LV does not know the SV's movement. Moreover, *Pij* and *Qij* denote the payoffs for the drivers of the SV and LV for each action strategy *aij*, respectively.

The drivers initially play the stage game to decide on an individual action at the moment when an SV, an LV, and a preceding (lead) vehicle (PV) are identified [12]. It was assumed that the initial game is played when the driver of the SV reaches the start of an acceleration lane. Additional stage games are formed by overtaking the PV or waiting to be overtaken by the LV. In other words, the stage game is re-built whenever a change in the surrounding vehicles (i.e., the PV, or the LV in the target lane) occurs.
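The strategy sets and the cooperative/competitive classification of the six action sets can be sketched as follows; the classification follows the text of Section 3.1, while the labels and function names themselves are illustrative.

```python
from itertools import product

# Pure strategies of the two players (notation from Section 3.1).
S = ["s1_change", "s2_wait", "s3_overtake"]   # driver of SV
L = ["l1_yield", "l2_block"]                  # driver of LV

# All six action-strategy sets a = (s_i, l_j).
action_sets = list(product(S, L))

# Classification given in the text: cooperative, competitive, or neither.
cooperative = {("s1_change", "l1_yield"),
               ("s2_wait", "l2_block"),
               ("s3_overtake", "l1_yield")}
competitive = {("s1_change", "l2_block"),
               ("s2_wait", "l1_yield")}

def classify(a):
    """Classify an action set (s_i, l_j) as in Section 3.1."""
    if a in cooperative:
        return "cooperative"
    if a in competitive:
        return "competitive"
    return "neither"
```

For example, `classify(("s3_overtake", "l2_block"))` returns `"neither"`, matching the text's description of (*s*3, *l*2).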

**Figure 1.** Players' strategies for merging maneuver: (**a**) the driver of subject vehicle (SV); (**b**) the driver of lag vehicle (LV).

**Figure 2.** Merging decision-making game in the extensive form.

#### *3.2. Repeated Game Design*

In the game model, one of the characteristics to be specified is the number of times the game is repeated [25]. In the authors' previous study, a repeated game approach was used in order to depict a practical decision-making process for merging maneuvers [13]. In real life, at a freeway merging section in a traditional transportation environment, a driver continuously makes a decision using the data taken in by sight and controls the vehicle to fulfill that decision. When the merging vehicle enters the acceleration lane, the driver of the SV selects a gap type for the lane change and then directs their vehicle accordingly. The driver controls the acceleration level to synchronize the vehicle speed with the freeway vehicles and ensure a safe gap distance [27,33]. During this lane-changing preparation process, the driver of SV repeatedly checks their surroundings to judge whether their decision can be fulfilled and tries to follow up on their decision. In this study, therefore, this repetition in decision-making for merging maneuvers prior to lane-changing execution was regarded as playing the game repeatedly.

The repeated game concept implies that a stage game with an identical structure is played repeatedly until termination of the game. Repeated games are divided into two classes, finite and infinite, depending on the players' beliefs about the number of repetitions. In this study, the decision-making game for merging was regarded as an infinitely repeated game because the players in the game do not know how many times the game will be repeated. Note that, in an infinitely repeated game, the stage game is not necessarily repeated an infinite number of times.

Drivers (i.e., players) interact by playing a stage game multiple times. In summary, the one-shot game model implies that previous game results do not affect the present game, while decision-makers take previous game results into account in the repeated game model, as illustrated in Figure 3. This study adopts the repeated decision-making game approach using cumulative payoffs to prevent repeated fluctuations in payoffs, as proposed in [13]. The stage decision-making game is conducted periodically and repeatedly over discrete time periods *T* ∈ [*t*1, *tn*]. Time preference is considered by assuming that payoffs are weighted proportionately at a constant rate δ, called the rate factor. The cumulative payoff of driver *d* for action strategy *aij*, i.e., $\mathcal{U}\_{ij}^{d}$ (= *Pij* or *Qij*), is presented in Equation (1).

$$
\mathcal{U}\_{ij}^d(T) = \sum\_{t=t\_1}^{t\_n} \delta^{t-1} u\_{ij}^d(t). \tag{1}
$$

Here, $u\_{ij}^{d}(t)$ is the utility of driver *d* for the action strategy set (*si*, *lj*) at time step *t*; *T* is the number of decision-making time steps; and *d* denotes a driver (i.e., a player in the game), either the driver of SV or the driver of LV. If δ > 1, the current payoffs are more important than the past payoffs; otherwise, the previous game results could significantly affect the decision-making in a future game.
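A minimal sketch of the cumulative payoff in Equation (1), under the assumption that the decision-making time steps are indexed from 1, so that the utility at step *t* is weighted by δ^(t−1):

```python
def cumulative_payoff(stage_utilities, delta):
    """Cumulative payoff U_ij^d(T) of Equation (1): a rate-factor-weighted
    sum of the stage-game utilities u_ij^d(t) over time steps t = 1..T.

    stage_utilities: list of u_ij^d(t) for t = 1, ..., T
    delta: rate factor; delta > 1 emphasizes the most recent payoffs.
    """
    return sum(delta ** (t - 1) * u
               for t, u in enumerate(stage_utilities, start=1))
```

With δ = 1, the cumulative payoff is a plain sum; with δ > 1, later (more recent) stage utilities dominate, which is the behavior described above.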

**Figure 3.** Decision-making game based on the repeated game approach in extensive form.

#### *3.3. Reformulated Payoff Functions*

In previous game theory-based models, the payoff functions for the two players were formulated using significant decision factors, such as safety, spacing (or gap), relative speed, travel time, expected acceleration level, and the remaining distance to the end of the acceleration lane [11–13,25,29,31]. In [12], we initially proposed payoffs using five decision factors: minimization of travel time, avoidance of collisions (i.e., safety), travel efficiency, the LV's expected acceleration, and the remaining distance to execute the maneuver. In a following study [13], the payoffs of the driver of SV were formulated using the expected gap and remaining distance, and the expected relative speed was considered as the other driver's main decision variable. Both previous studies used multiple variables with different dimensions, meaning the payoffs can only be interpreted as qualitative outcomes representing each player's preference. In addition, an error term was considered to capture unobserved variables, but it was assumed to be a constant, resulting in minimal consideration of a driver's randomness. As described previously, therefore, this study updates the payoff functions to use efficient decision variables, including a random error term, and proposes monotone (dimensionless) functions obtained by transforming the quantitative variables. This section introduces the decision variables and then presents the reformulated payoff functions for each driver.

#### 3.3.1. Safety Payoff

Among the various decision factors, safety is a key factor in a human driver's decision to avoid a potential collision or not induce a dangerous situation. Yu et al. [31] used the time headway as a safety payoff, as presented in Equation (2).

$$h\_{PV,SV}(t) = \frac{\mathbf{x}\_{PV}(t) - \mathbf{x}\_{SV}(t)}{v\_{SV}(t)},\tag{2}$$

Here, *xPV*(*t*) and *xSV*(*t*) are the positions of the (potential) PV and SV at instant time *t*, respectively; and *vSV*(*t*) is speed of the SV at time *t*. However, they did not take the speed of a PV into account. In [13], the expected spacing between vehicles, indicating the possibility of ensuring a safe distance with consideration of vehicles' speed and acceleration levels, was used. Additionally, Wang et al. [34] used a penalty formulated using relative speed and the gap distance. Kita [25] used the Time-To-Collision (TTC) between vehicles as the main payoff, as defined in Equation (3).

$$TTC\_{PV,SV}(t) = \frac{\mathbf{x}\_{PV}(t) - \mathbf{x}\_{SV}(t) - l\_{PV}}{v\_{SV}(t) - v\_{PV}(t)} \qquad \text{if } v\_{SV}(t) > v\_{PV}(t), \tag{3}$$

Here, *lPV* denotes the length of the PV; and *vPV*(*t*) is the speed of the PV at instant time *t*.

The interactive effects of relative speed and gap distance are contained in the single measure TTC [35]. Brackstone et al. [36] collected realistic data using an instrumented vehicle equipped with relative distance- and speed-measuring sensors. Observations of vehicle trajectories from five participants showed that TTC is a major factor in lane-changing decisions. Most collision avoidance systems (or pre-crash safety systems) applied in a vehicle use the instantaneous TTC to evaluate collision risk [37]. Moreover, Vogel [38] recommended the use of TTC for the evaluation of safety because it indicates the actual occurrence of dangerous situations. Vogel also noted that a situation with a small TTC is imminently dangerous and that a situation with a small headway and relatively large TTC is a potentially dangerous situation. Therefore, this study proposes the integrated safety payoff function *A<sup>S</sup>* with consideration of not only TTC but also headway, which was formulated using the hyperbolic tangent function, as presented in Equations (4) and (5).

$$A\_{PV,SV}^S = \begin{cases} \left( \tanh\left(\frac{TTC\_{PV,SV}(t)}{t^S} - 1\right) + \tanh\left(\frac{h\_{PV,SV}(t)}{t^S} - 1\right) \right) \times 0.5, & \text{if } v\_{SV}(t) > v\_{PV}(t) \\ \left( 1 + \tanh\left(\frac{h\_{PV,SV}(t)}{t^S} - 1\right) \right) \times 0.5, & \text{o.w.} \end{cases} \tag{4}$$

$$A\_{SV,LV}^S = \begin{cases} \left( \tanh\left(\frac{TTC\_{SV,LV}(t)}{t^S} - 1\right) + \tanh\left(\frac{h\_{SV,LV}(t)}{t^S} - 1\right) \right) \times 0.5, & \text{if } v\_{LV}(t) > v\_{SV}(t) \\ \left( 1 + \tanh\left(\frac{h\_{SV,LV}(t)}{t^S} - 1\right) \right) \times 0.5, & \text{o.w.} \end{cases} \tag{5}$$

Here, $t^S = \min\left( \frac{RD\_{SV}}{v\_{SV}(t)},\ 3 \right)$ denotes the minimum safe time headway, taken as the smaller of the 3-second rule recommended by the National Safety Council [39] and the time headway required to reach the end of the acceleration lane.
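Equations (2)–(5) can be sketched for a generic leader–follower pair as follows; the function and parameter names are illustrative, and `t_s` stands for the minimum safe time headway defined above. Because tanh maps into (−1, 1), the resulting payoff lies in (−1, 1), consistent with the bounds stated below.

```python
import math

def headway(x_lead, x_follow, v_follow):
    """Time headway of Equation (2): h(t) = (x_lead - x_follow) / v_follow."""
    return (x_lead - x_follow) / v_follow

def ttc(x_lead, x_follow, l_lead, v_follow, v_lead):
    """Time-to-collision of Equation (3); defined only when the
    follower is faster than the leader."""
    assert v_follow > v_lead
    return (x_lead - x_follow - l_lead) / (v_follow - v_lead)

def safety_payoff(x_lead, x_follow, l_lead, v_follow, v_lead, t_s):
    """Integrated safety payoff A^S of Equations (4)-(5), combining TTC
    and headway through hyperbolic tangents; t_s is the minimum safe
    time headway min(RD / v, 3)."""
    h = headway(x_lead, x_follow, v_follow)
    if v_follow > v_lead:
        # Closing in on the leader: both TTC and headway terms apply.
        return 0.5 * (math.tanh(ttc(x_lead, x_follow, l_lead,
                                    v_follow, v_lead) / t_s - 1)
                      + math.tanh(h / t_s - 1))
    # Not closing in: only the headway term applies.
    return 0.5 * (1 + math.tanh(h / t_s - 1))
```

For Equation (4) the leader is the PV and the follower the SV; for Equation (5) the leader is the SV and the follower the LV.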

The safety payoffs of both drivers for the action strategies were formulated to satisfy *U<sup>S</sup>* ∈ [−1, 1], as shown in Equations (6) to (9).

$$\mathcal{U}\_{SV}^{S}(s\_1) = 0.5\left(A\_{PV,SV}^{S} + A\_{SV,LV}^{S}\right), \tag{6}$$

$$\mathcal{U}\_{SV}^{S}(s\_2) = -A\_{SV,LV}^{S}, \tag{7}$$

$$\mathcal{U}\_{SV}^{S}(s\_3) = -A\_{PV,SV}^{S}, \tag{8}$$

$$\mathcal{U}\_{LV}^{S}(l\_1) = A\_{SV,LV}^{S} = -\mathcal{U}\_{LV}^{S}(l\_2). \tag{9}$$

For the 'change (*s*1)' action of the driver of SV, $\mathcal{U}\_{SV}^{S}(s\_1)$ was formulated as the average of the safety payoffs, taking both the PV and LV in the target lane into account. For the 'wait (*s*2)' and 'overtake (*s*3)' actions of the driver of SV, on the other hand, the safety payoffs were formulated to consider only the vehicle related to each action strategy. Likewise, it was assumed that the driver of LV evaluates their safety in consideration of the SV only.

As shown in the safety payoff formulation, the safety payoffs vary with the spacing between vehicles and each vehicle's speed. Figure 4 shows the prospective safety payoffs of the driver of SV at various speeds of the three vehicles (i.e., PV, SV, and LV), with the SV in different positions between the PV and LV. In this example, the spacing between the PV and LV is constant at 77 m. Figure 4a presents a case in which the SV is located close to the PV; in other words, the lead gap Δ*xPV*,*SV* is small and the lag gap Δ*xSV*,*LV* is large. If *vPV* > *vSV*, $\mathcal{U}\_{SV}^{S}(s\_1)$ is greater than $\mathcal{U}\_{SV}^{S}(s\_3)$; otherwise, the driver of SV is attracted to the 'overtake (*s*3)' action in consideration of safety. In the second case, described in Figure 4b, the SV is located midway between the PV and LV. Therefore, the 'change (*s*1)' action is relatively attractive, i.e., $\mathcal{U}\_{SV}^{S}(s\_1) > \mathcal{U}\_{SV}^{S}(s\_2)$ and $\mathcal{U}\_{SV}^{S}(s\_1) > \mathcal{U}\_{SV}^{S}(s\_3)$, even if *vSV* is slightly less than *vPV* and *vLV*. The 'overtake (*s*3)' action becomes attractive when *vSV* > *vPV*, and $\mathcal{U}\_{SV}^{S}(s\_2)$ becomes greater than $\mathcal{U}\_{SV}^{S}(s\_1)$ when *vSV* < *vLV*. The last case, in which the SV is close to the LV, represents the case where the driver of SV is drawn to the 'wait (*s*2)' action if *vLV* > *vSV*; if *vSV* > *vLV*, the 'change (*s*1)' action is more attractive. These cases indicate that the transformed safety payoffs reasonably represent the general decision-making of the driver of SV.

**Figure 4.** Safety payoffs of the driver of SV for the *s*1, *s*2, and *s*<sup>3</sup> action: (**a**) close to the preceding vehicle (PV) (Δ*xSV*,*LV* = 67 m, Δ*xPV*,*SV* = 10 m); (**b**) middle position between PV and LV (Δ*xSV*,*LV* = 38 m, Δ*xPV*,*SV* = 39 m); (**c**) close to the LV (Δ*xSV*,*LV* = 10 m, Δ*xPV*, *SV* = 67 m).

Figure 5 presents the safety payoffs for the driver of LV in the three cases described above. In Figure 5a, which shows that Δ*xSV*,*LV* is considerably large, the driver of LV prefers the 'yield (*l*1)' action, except in the case where *vLV* is much greater than *vSV*. These payoffs seem reasonable because the LV is far away from the SV. In the second case, the 'yield (*l*1)' action is attractive as well. This case is similar to a real field situation, where the following vehicle in the target lane mostly cooperates in order to accept the merging vehicle's lane change. In the third case, a large deceleration is expected in order to provide a gap to the SV because the LV is close to the SV. Therefore, the safety payoffs of the driver of LV for the 'block (*l*2)' action are higher than for the *l*1 action if *vSV* < *vLV*. Otherwise, the safety payoff of the driver of LV for the 'yield (*l*1)' action is slightly higher, except in a congested freeway traffic condition (i.e., *vSV* ≈ *vLV*).

**Figure 5.** Safety payoffs of the driver of LV for the *l*<sup>1</sup> and *l*<sup>2</sup> action: (**a**) close to the PV (Δ*xSV*,*LV* = 67 m, Δ*xPV*,*SV* = 10 m); (**b**) middle position between PV and LV (Δ*xSV*,*LV* = 38 m, Δ*xPV*,*SV* = 39 m); (**c**) close to the LV (Δ*xSV*,*LV* = 10 m, Δ*xPV*,*SV* = 67 m).

#### 3.3.2. Forced Merging Payoff for the Driver of SV

According to the empirical field data collected at a freeway merging section, the driver of a vehicle entering through an on-ramp usually accelerates for speed harmonization with freeway vehicles. The driver of SV then selects a gap to merge onto the freeway. In congested traffic conditions, however, the merging vehicles travel at a higher speed than the surrounding vehicles on the freeway. Thus, the driver occasionally rejects the initial gap and then uses a farther forward gap, close to the end of the acceleration lane. Wan et al. found that, in congested traffic conditions, merging vehicles pass freeway vehicles and try to find an acceptable gap to merge onto the freeway after traveling longer than in normal merging cases [27]. Marczak et al. [40] analyzed data collected at two sites to find variables related to gap acceptance, concluding that the distance to the end of the acceleration lane is a significant variable. Hwang and Park [41] also concluded that the remaining distance is the most important factor for determining gap acceptance; the driver will most likely accept a smaller gap if the remaining distance to the end of the acceleration lane is smaller. In order to consider the case in which a vehicle merges close to the end of the acceleration lane, the payoff function of the driver of SV should include a term called the forced merging payoff, which relates to the remaining distance to the end of the acceleration lane. This term affects cases where the driver chooses the 'change (*s*1)' action at a decision point where the remaining distance is considerably short.

This study formulated the forced merging payoff as a function of the remaining distance and *vSV*(*t*). It is assumed that the end of the acceleration lane is an imaginary preceding vehicle that is stopped. The presence of this imaginary vehicle, which acts as a hard wall, means the driver of SV cannot drive further, due to the restricted length of the acceleration lane. Thus, the expected safe distance required to maintain the instant speed of the SV, *vSV*(*t*), was estimated by a car-following model. This study used the Rakha-Pasumarthy-Adjerid (RPA) car-following model, first developed by Rakha et al. [42]. The performance of the RPA car-following model has been validated against naturalistic driving data [43]. This study estimated the safety distance for the SV, $\mathbf{x}\_{SV}^{CF}(t)$, using the RPA model's two components: steady-state traffic stream behavior and collision avoidance. The steady-state component applies Van Aerde's steady-state car-following model [44,45], which is a non-linear single-regime function of vehicle speed and spacing. The first safe spacing (i.e., safety distance) provided by the steady-state model is

$$\mathbf{x}\_{SV}^{CF\_1}(t) = c\_1 + c\_3 v\_{SV}(t) + \frac{c\_2}{v\_f - v\_{SV}(t)}. \tag{10}$$

Here, *vf* indicates the free-flow speed. The model coefficients can be computed as

$$c\_1 = \frac{v\_f}{k\_j v\_c^2}\left(2v\_c - v\_f\right), \tag{11}$$

$$c\_2 = \frac{v\_f}{k\_j v\_c^2}\left(v\_f - v\_c\right)^2, \tag{12}$$

$$c\_3 = \frac{1}{q\_c} - \frac{v\_f}{k\_j v\_c^2}. \tag{13}$$

Here, *kj*, *vc*, and *qc* indicate the jam density, speed-at-capacity, and saturation flow rate, respectively. The detailed definition of these coefficients is described in [44].
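The coefficients of Equations (11)–(13) and the steady-state spacing of Equation (10) can be sketched as follows; the function names and the parameter values in the comments are illustrative, not calibrated values from the source.

```python
def van_aerde_coefficients(v_f, v_c, q_c, k_j):
    """Coefficients c1, c2, c3 of Equations (11)-(13) for Van Aerde's
    steady-state car-following model, from free-flow speed v_f (m/s),
    speed-at-capacity v_c (m/s), saturation flow rate q_c (veh/s), and
    jam density k_j (veh/m)."""
    m = v_f / (k_j * v_c ** 2)   # common factor v_f / (k_j * v_c^2)
    c1 = m * (2 * v_c - v_f)
    c2 = m * (v_f - v_c) ** 2
    c3 = 1.0 / q_c - m
    return c1, c2, c3

def steady_state_spacing(v, v_f, c1, c2, c3):
    """First safe spacing of Equation (10) at speed v."""
    return c1 + c3 * v + c2 / (v_f - v)
```

A useful consistency check: at standstill (v = 0), the spacing reduces to c1 + c2/v_f = 1/k_j, i.e., the jam spacing, as the definitions of the coefficients imply.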

As the second component of the RPA model, collision avoidance was modeled to avoid incidents under non-steady-state conditions [43]. The second safe spacing, estimated by the collision avoidance component, is defined as

$$\mathbf{x}\_{SV}^{CF\_2}(t) = \frac{v\_{SV}(t)^2}{2 \cdot a\_{\min}} + x\_j. \tag{14}$$

Here, *amin* and *xj* denote the minimum acceleration (i.e., maximum deceleration) and the jam spacing, respectively.

The maximum of the two safe spacings, $\mathbf{x}\_{SV}^{CF\_1}(t)$ and $\mathbf{x}\_{SV}^{CF\_2}(t)$, is considered as the expected safe spacing to maintain the current speed.

$$\mathbf{x}\_{SV}^{CF}(t) = \min\left\{ \max\left\{ \mathbf{x}\_{SV}^{CF\_1}(t),\ \mathbf{x}\_{SV}^{CF\_2}(t) \right\},\ x\_{max}^{RD} \right\}. \tag{15}$$

Here, $x\_{max}^{RD}$ is the maximum remaining distance, i.e., the longitudinal length of the acceleration lane.

To balance the individual payoffs, this study re-formulated the forced merging payoff of the driver of SV, $\mathcal{U}\_{SV}^{FM}$.

$$\mathcal{U}\_{SV}^{FM} = \left[ \frac{\max\left( \mathbf{x}\_{SV}^{CF}(t) - \mathbf{x}\_{SV}^{RD}(t),\ 0 \right)}{\mathbf{x}\_{SV}^{CF}(t)} \right]^2. \tag{16}$$

Here, $\mathbf{x}\_{SV}^{RD}(t)$ indicates the remaining distance for the SV in the acceleration lane at time *t*. This formulation satisfies $\mathcal{U}\_{SV}^{FM} \in [0, 1]$, as shown in Figure 6. If the remaining distance is shorter than $\mathbf{x}\_{SV}^{CF}(t)$, $\mathcal{U}\_{SV}^{FM}$ begins to take positive values, inducing a preference for the 'change (*s*1)' action. This term yields greater payoffs when *vSV*(*t*) is higher.
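The collision-avoidance spacing of Equation (14), the expected safe spacing of Equation (15), and the forced merging payoff of Equation (16) can be sketched as follows. Two assumptions are made explicit in the code: `a_min` is passed as a positive deceleration magnitude, and the capping of the expected safe spacing at the lane length is one reading of Equation (15).

```python
def collision_avoidance_spacing(v, a_min, x_j):
    """Second safe spacing of Equation (14): braking distance at the
    maximum deceleration a_min (positive magnitude, m/s^2) plus the
    jam spacing x_j."""
    return v ** 2 / (2 * a_min) + x_j

def expected_safe_spacing(x_cf1, x_cf2, x_rd_max):
    """Expected safe spacing of Equation (15): the larger of the two
    safe spacings, capped at the acceleration-lane length x_rd_max
    (the capping is an assumed reading of Equation (15))."""
    return min(max(x_cf1, x_cf2), x_rd_max)

def forced_merging_payoff(x_cf, x_rd):
    """Forced merging payoff U^FM of Equation (16); x_cf is the expected
    safe spacing and x_rd the remaining distance. Lies in [0, 1] and is
    zero whenever the remaining distance exceeds the safe spacing."""
    return (max(x_cf - x_rd, 0.0) / x_cf) ** 2
```

For example, with a safe spacing of 100 m, a remaining distance of 150 m gives a payoff of 0, while a remaining distance of 50 m gives 0.25, reflecting the growing pressure to force the merge.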

**Figure 6.** Forced merging payoff by the remaining distance at various speeds.

#### 3.3.3. Payoff Functions for the Drivers of the SV and LV

Table 1 represents the updated merging decision-making model in normal form. The payoff functions of the driver of SV consist of both the safety and forced merging payoffs, and those of the driver of LV include the safety payoffs only. In order to capture unobserved utility, both players' payoff functions also include an error term, which was assumed to be normally distributed as $\varepsilon\_{ij}^{SV \text{ or } LV} \sim N(0, 1)$. The parameters in the payoff functions, i.e., the sets of α*ij* and β*ij* (*i* = 1, 2, 3 and *j* = 1, 2), are to be estimated.

**Table 1.** Game Structure and Payoff Functions of the Merging Decision-Making Game in Normal Form.


<sup>1</sup> *pi* in parentheses denotes the probability assigned to the pure strategy of the driver of SV, *si*; $\sum\_{i=1}^{3} p\_i = 1$. <sup>2</sup> *qj* in parentheses denotes the probability assigned to the pure strategy of the driver of LV, *lj*; $\sum\_{j=1}^{2} q\_j = 1$.

#### **4. Model Calibration and Validation**

Model evaluation was conducted to prove the efficiency of the game models using the stage game based on the newly formulated payoff functions. This section introduces the observation dataset for model evaluation and calibration methodology. In addition, the calibration and validation results of our previous model and the updated repeated game models are presented.

#### *4.1. Preparation of Observation Dataset*

This study used NGSIM vehicle trajectory data from a segment of U.S. Highway 101 (Hollywood Freeway) in Los Angeles, California, collected between 7:50 and 8:35 a.m. on June 15, 2005 [14,15]. Reasonable classification of the action strategies chosen by the drivers of the SV and LV is a critical issue, as it is directly related to the results of the game model [13]. There is a limitation on the classification of drivers' decisions based on trajectory and speed profile data. This study used a total of 1504 observations extracted from NGSIM data in [13]. For classification of the SV's maneuvers observed in the field, this study used the type of gap selected at the game-playing moment from among the three following gap types (as illustrated in Figure 1a): (1) forward (lead) gap, (2) adjacent (current) gap, or (3) backward (lag) gap. In addition, the spacing between the SV and LV was used for the classification of the LV's maneuvers. A detailed classification methodology is described in [13]. Next, all data were reviewed to judge whether the classification results reasonably reflected the drivers' intentions; data regarded as improperly classified were corrected. Decisions made by drivers in all observations were classified using this process.

#### *4.2. Model Calibration*

#### 4.2.1. Calibration Approach

In the game model, each player chooses an action to achieve the goal of the game. In game theory, a Nash equilibrium is a set of strategies from which neither driver has an incentive to deviate unilaterally. If a Nash equilibrium exists, each player chooses the strategy that maximizes their own payoff while considering an opponent who also wants to maximize their payoff. A pure-strategy Nash equilibrium is defined as

$$\begin{cases} P(s^\*, l^\*) \ge P(s\_i, l^\*), & \forall\, s\_i \in S,\ i = 1, 2, 3 \\ Q(s^\*, l^\*) \ge Q(s^\*, l\_j), & \forall\, l\_j \in L,\ j = 1, 2 \end{cases} \tag{17}$$

where *s\** and *l\** denote the equilibrium action strategies of the drivers of the SV and LV, respectively. If a pure-strategy Nash equilibrium does not exist, a mixed-strategy Nash equilibrium is used, in which at least one player randomizes over strategies and no player can increase their expected payoff by switching to an alternative strategy. A probability is assigned to each player's strategy based on each player's expected payoffs from the different strategies [28]. This paper used the MATLAB function N-Person Game (NPG), developed by Chatterjee [46], to solve the two-player, finite, non-cooperative game. Chatterjee's algorithm [46] computes the Nash equilibrium in mixed strategies from the estimated parameters and expected payoffs (i.e., *Pij* and *Qij*), and provides the probabilities with which each driver chooses each pure action strategy (i.e., *pi* and *qj*) in each observation.
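Condition (17) can be checked by direct enumeration. Below is a minimal Python sketch (the paper itself uses Chatterjee's NPG function in MATLAB) that finds the pure-strategy Nash equilibria of the 3×2 bimatrix game; the payoff numbers are hypothetical, not the calibrated values.

```python
import numpy as np

def pure_nash_equilibria(P, Q):
    """Enumerate pure-strategy Nash equilibria of a bimatrix game.

    P[i, j]: payoff to the SV driver for action set (s_i, l_j).
    Q[i, j]: payoff to the LV driver for the same action set.
    A cell (i, j) is an equilibrium when neither player can gain by
    unilaterally deviating, i.e., condition (17) in the text.
    """
    equilibria = []
    for i in range(P.shape[0]):
        for j in range(P.shape[1]):
            sv_best = P[i, j] >= P[:, j].max()   # SV cannot improve given l_j
            lv_best = Q[i, j] >= Q[i, :].max()   # LV cannot improve given s_i
            if sv_best and lv_best:
                equilibria.append((i, j))
    return equilibria

# Illustrative 3x2 payoffs (hypothetical numbers, not the calibrated ones):
P = np.array([[3.0, 0.0], [1.0, 1.0], [2.0, 0.5]])  # SV: change / wait / overtake
Q = np.array([[2.0, 1.0], [1.0, 2.0], [1.5, 0.0]])  # LV: yield / block
print(pure_nash_equilibria(P, Q))
```

When no cell satisfies both best-response checks, no pure-strategy equilibrium exists and a mixed-strategy solver such as NPG is required.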

In order to calibrate the merging decision-making model, this study followed the calibration method developed by Liu et al. [9], who proposed a parameter estimation method by solving a bi-level programming problem. As illustrated in Figure 7, the lower-level programming is to find the Nash equilibrium using Chatterjee's function [46]. The upper level is a non-linear programming problem that minimizes the total deviation in probabilities in the system in order to choose actual observed actions using the following function

$$\min \sum_{k=1}^{n} \left( 1 - p_{a^k} \cdot q_{a^k} \right), \tag{18}$$

where *k* denotes the index of observations; *a<sup>k</sup>* is the observed action strategy set (*s<sub>i</sub><sup>k</sup>*, *l<sub>j</sub><sup>k</sup>*) in observation *k*; and *p<sub>a<sup>k</sup></sub>* and *q<sub>a<sup>k</sup></sub>* are the probabilities that the drivers of the SV and LV, respectively, choose the observed action in *a<sup>k</sup>*. Here, *A<sup>k</sup>* and *B<sup>k</sup>* denote all parameters to be estimated for each driver's payoff functions.
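The upper level of the bi-level calibration evaluates Equation (18) for a candidate parameter set. A hedged sketch follows, with the lower-level equilibrium solver stubbed out (in the paper this role is played by Chatterjee's algorithm [46]); the observation format and stub values are assumptions for illustration.

```python
def calibration_objective(observations, solve_equilibrium):
    """Upper-level objective of the bi-level calibration (Equation (18)).

    observations: list of dicts holding the observed action indices
    (i_obs over s_1..s_3, j_obs over l_1..l_2).
    solve_equilibrium: lower-level routine returning the mixed-strategy
    probabilities (p, q) for one observation's payoff matrices.
    """
    total = 0.0
    for obs in observations:
        p, q = solve_equilibrium(obs)        # mixed-strategy probabilities
        i, j = obs["i_obs"], obs["j_obs"]    # action set actually observed
        total += 1.0 - p[i] * q[j]           # deviation from observed choice
    return total

# Toy example with a stub solver that always returns fixed probabilities:
stub = lambda obs: ([0.8, 0.1, 0.1], [0.7, 0.3])
obs_list = [{"i_obs": 0, "j_obs": 0}, {"i_obs": 1, "j_obs": 1}]
print(calibration_objective(obs_list, stub))
```

Minimizing this objective over the payoff parameters (e.g., with a nonlinear programming routine) yields the calibrated model; the equilibrium solver is re-run at every candidate parameter set.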

**Figure 7.** Schematic workflow for bi-level programming.

#### 4.2.2. Calibration Results

As mentioned earlier, this study calibrated two types of game model: (1) the one-shot game model, in which the developed stage game is played independently at every game point based only on the instantaneous status; and (2) the repeated game model, which uses the cumulative payoffs with factor δ at various rates and is played every 0.5 s. To verify the performance of the updated payoff functions in predicting human drivers' decisions in merging situations, the first type of model was subdivided into two models according to the payoff functions used in model calibration, as below.


Here, the former and latter models were called the 'previous one-shot game model' and the 'one-shot game model', respectively. For model calibration, an NGSIM dataset observed between 7:50 and 8:20 a.m. was used. The number of observations used in model calibration was 685 (out of 1504). Table 2 shows the estimated parameters of the payoff functions of the drivers of the SV and LV.


**Table 2.** Estimated Parameters of the Payoff Functions for Game Models.

Note that the previous one-shot game model using the payoff functions in [13] was calibrated using the same calibration methodology, but the estimated parameters are not shown in the table because of the different formulation for payoff functions.

In order to compare the models' prediction accuracy, the mean absolute error (MAE) was calculated using Equation (19)

$$MAE = \frac{1}{N} \sum_{k=1}^{N} \left| 1 - \mathbf{1}(\hat{x}_k - x_k) \right|, \tag{19}$$

where *N*, *x̂<sub>k</sub>*, and *x<sub>k</sub>* denote the number of observations, the model prediction, and the actual observation, respectively. Note that 1(*x̂<sub>k</sub>* − *x<sub>k</sub>*) equals one if *x̂<sub>k</sub>* = *x<sub>k</sub>*, and zero otherwise. The model prediction *x̂<sub>k</sub>* was estimated from the probabilities calculated using Chatterjee's algorithm [46]. Table 3 shows the calibration results (MAEs) for the three types of models. In comparison with our previous model, the one-shot game model using the updated payoff functions shows a higher prediction capacity for merging decision-making. Among the repeated game models, the models with δ > 1.0 were calibrated with lower MAEs than those with δ ≤ 1.0.
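Because of the indicator, the MAE of Equation (19) reduces to the fraction of observations whose predicted action set differs from the observed one. A minimal sketch (the paper's computations were done in MATLAB; the action-set labels here are hypothetical):

```python
import numpy as np

def mae(predicted, observed):
    """MAE of Equation (19): the share of observations where the
    predicted action set differs from the observed one. The indicator
    1(x_hat - x) is one when prediction and observation match."""
    predicted = np.asarray(predicted)
    observed = np.asarray(observed)
    indicator = (predicted == observed).astype(float)
    return np.mean(np.abs(1.0 - indicator))

# Hypothetical action-set labels (0..5 indexing the six strategy sets):
pred = [0, 0, 3, 1, 5]
obs  = [0, 1, 3, 1, 4]
print(mae(pred, obs))         # 2 mismatches out of 5 observations
print(1.0 - mae(pred, obs))   # corresponding prediction accuracy
```

Prediction accuracy, as reported in Tables 3 and 4, is simply one minus this MAE.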


**Table 3.** Calibration Results.

<sup>1</sup> Not applicable. <sup>2</sup> The number in parentheses indicates prediction accuracy.

#### *4.3. Model Validation*

The rest of the data, 819 observations out of 1504, collected between 8:20 and 8:35 a.m., were used to validate the model; the validation results are shown in Table 4. The validation results, which show the same trends as the calibration results, are summarized as follows. First, comparing the stage game developed in the previous study [13] with that of this study, the prediction accuracy increases by about 12% when the third stage game is used. This study thus improves the decision-making game model's performance through the reformulated payoff functions representing merging maneuvers. Next, in the validation results, the repeated game models with δ ≥ 1.0 show a prediction accuracy higher than 85%; in particular, the repeated game model performs best at δ = 1.4. Both the one-shot game model and the repeated game model with δ = 1.4 show a considerably high prediction accuracy of more than 86%. Nevertheless, because of the limitations of the unbalanced observation data [12], model validation using field data cannot provide clear evidence of the benefit of using the repeated game, and it is difficult to show an apparent difference between the one-shot and repeated game models. In the following sections, therefore, the game models are further evaluated through sensitivity analysis and a simulation study.

**Table 4.** Validation Results.


<sup>1</sup> The number in parentheses indicates prediction accuracy.

#### **5. Sensitivity Analysis of the Calibrated Stage Game**

In this section, this study describes the sensitivity analysis conducted to observe how factor changes related to the proposed payoffs impact the stage game results. In reality, drivers' merging behavior to select an acceptable gap size and speed difference between the freeway mainline vehicles and the merging vehicle is different depending on the merging point [27,40]. Hence, this sensitivity analysis is required to demonstrate whether the developed stage game model represents merging behaviors observed in the field in various conditions. To show the decision-making model's sensitivity, the stage game is independently played in diverse scenarios varied by three input factors: game location, relative speed, and spacing. Preparation for the sensitivity analysis is presented first in the following sections, then results and corresponding discussions are provided.

#### *5.1. Sensitivity Analysis Setting*

As shown in Figure 8, a freeway segment that included an on-ramp was used for the analysis, with the game locations classified into two areas: the beginning of the acceleration lane and the end of the acceleration lane. For the spacing factor test, the SV changed its position between the PV and LV. For the speed profile test, the freeway mainline vehicles' speed was categorized into five scenarios: 60 km/h, 70 km/h, 80 km/h, 90 km/h, and 100 km/h. In each speed scenario, the SV's speed varied from 60 km/h to 100 km/h. The freeway testbed and calibrated stage game were modeled in MATLAB, and the other simulation settings are described below.


**Figure 8.** Topology of freeway merging section for sensitivity analysis.

#### *5.2. Sensitivity Analysis Results*

Based on the results of the stage game played at the two locations under various lag spacing and relative speed scenarios, the impacts of the input factors and other findings revealed by the sensitivity analysis are provided. Figure 9a–e show the results of games played near the beginning of the acceleration lane, and Figure 9f–j show those of games played near the end of the acceleration lane. The Chatterjee function for finding the Nash equilibrium was used to decide these game results [46]. If the game result in each case is a pure-strategy Nash equilibrium, the corresponding action set is the dominant decision made by the two drivers, i.e., the probability of one of the six action strategy sets (*p<sub>i</sub>* × *q<sub>j</sub>*) is one. Otherwise, when a mixed-strategy Nash equilibrium exists, the game result is chosen randomly according to the probabilities.
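The sensitivity sweep itself is a grid evaluation of the stage game over SV speed and lag spacing at each game location, as plotted in Figure 9. A sketch follows, with a toy decision rule standing in for the calibrated game (an assumption for illustration only; the real sweep plays the calibrated stage game with the full payoff functions).

```python
import itertools

def sensitivity_sweep(play_stage_game, sv_speeds, lag_spacings, location):
    """Play the stage game over a grid of SV speeds (km/h) and lag
    spacings (m) at one game location, as in the Figure 9 panels.

    play_stage_game: callable for the stage game; returns the chosen
    action strategy set label, e.g. '(s1, l1)'.
    """
    results = {}
    for v_n, dx in itertools.product(sv_speeds, lag_spacings):
        results[(v_n, dx)] = play_stage_game(v_n, dx, location)
    return results

# Toy decision rule standing in for the calibrated game (assumption):
def toy_game(v_n, dx, location):
    if dx < 10:
        return "(s2, l2)"   # wait / block at short lag spacing
    return "(s1, l1)"       # change / yield otherwise

grid = sensitivity_sweep(toy_game, range(60, 101, 10), range(0, 41, 10), "begin")
print(grid[(60, 0)], grid[(80, 20)])
```

Each grid cell corresponds to one point in a Figure 9 panel; repeating the sweep for the second location yields panels (f)–(j).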

Differences in drivers' behaviors based on the merging point are distinct in merging maneuver decisions. Near the beginning of the acceleration lane, a merging vehicle driver usually passes a lead vehicle when *v<sub>n</sub>* > *v<sub>n−1</sub>* and the lead spacing (Δ*x<sub>n−1,n</sub>*) is quite small [27]. The higher psychological pressure related to merging makes drivers accept smaller gaps as they approach the end of the auxiliary lane, compared to cases where they can take the original gap near the beginning of the acceleration lane [27]. In other words, field data show that the driver of the SV attempted a forced merging maneuver close to the end of the acceleration lane [27,33]. When *v<sub>n</sub>* < *v<sub>n+1</sub>* and the lag spacing (Δ*x<sub>n,n+1</sub>*) is quite small, the driver of the SV waits until the LV passes the SV and then may merge using the backward gap. In Figure 9, the calibrated stage game results show these behaviors in the choice of the 'overtake (*s*3)' and 'wait (*s*2)' actions according to the game location.

**Figure 9.** Graphical representation of the one-shot game results depending on game location, spacing between vehicles (Δ*x<sub>n,n+1</sub>*), and speed of the SV (*v<sub>n</sub>*): (**a–e**) game played at the beginning of the acceleration lane with mainline vehicles driving at 60 km/h to 100 km/h, respectively; (**f–j**) game played at the end of the acceleration lane with mainline vehicles driving at 60 km/h to 100 km/h, respectively. Note that a red line parallel to the x-axis on each graph indicates the speed of the freeway mainline vehicles (*v<sub>n−1</sub>*, *v<sub>n+1</sub>*).

Near the beginning of the lane, as illustrated in Figure 9a–d, the game results show that the driver of the SV chooses the 'overtake (*s*3)' action in conditions with higher relative speed and short lead spacing. In contrast, the game results in Figure 9f–i show that the driver of the SV intentionally changes lane due to the short remaining distance in the acceleration lane. For the 'wait (*s*2)' action, the stage game results for merging decision-making differ according to game location. These results demonstrate that the forced merging utility works correctly when the SV is close to the end of the acceleration lane. Consequently, the stage game developed in this study accurately depicts realistic decisions made by human drivers according to game location.

As discussed in Section 3.3.3, TTC is critical in making lane-changing decisions. Since TTC comprises spacing (i.e., space headway) and relative speed, both are important in human drivers' decision-making for merging maneuvers at freeway merging sections. Hence, this study also analyzed the impacts of these factors. In Figure 9c, blue lines parallel to the y-axis (marked 1 to 3) and green lines parallel to the x-axis (marked A and B) denote the test cases for the sensitivity analyses on relative speed and spacing, respectively.

In the sensitivity analysis on relative speed, the PV and LV are assumed to drive at 80 km/h, and the SV's speed varies from 60 km/h to 100 km/h. Scenarios were prepared with three lag spacings: 10 m, 20 m, and 30 m; the game results of all scenarios are shown in Figure 10. The game results clearly show that relative speed affects decision-making. When the lag spacing (Δ*x<sub>n,n+1</sub>*) is 10 m (as shown in Figure 10a), the drivers of the SV and LV decide on the 'wait (*s*2) and block (*l*2)' action set if Δ*v<sub>n,n+1</sub>* ≤ −10 km/h. In addition, both drivers are willing to choose the 'change (*s*1) and yield (*l*1)' action set through the stage game if Δ*v<sub>n,n+1</sub>* ≥ −7 km/h. These cooperative action strategy sets result from both drivers' common consent subject to safety. In a certain range, i.e., −10 km/h < Δ*v<sub>n,n+1</sub>* < −7 km/h, the drivers' desired actions are competitive; in these conditions, the non-cooperative 'change (*s*1) and block (*l*2)' actions will be carried out.

**Figure 10.** Game results on relative speed: (**a**) Δ*x<sub>n,n+1</sub>* = 10 m; (**b**) Δ*x<sub>n,n+1</sub>* = 20 m; (**c**) Δ*x<sub>n,n+1</sub>* = 30 m.

When Δ*x<sub>n,n+1</sub>* = 20 m, as shown in Figure 10b, the drivers of the SV and LV choose the cooperative action strategy (*s*1, *l*1) even if Δ*v<sub>n,n+1</sub>* = −20 km/h. This means that relative speed is largely irrelevant to the driver of the SV's choice of a lane-changing action if there is sufficient spacing between vehicles. If there is enough space headway, real-life experience generally shows that the driver of a merging vehicle will change lane upon reaching an acceleration lane even though a speed harmonization process is required. In response to the merging vehicle's lane change, the driver of the LV decreases speed to adjust to the new preceding vehicle (i.e., the SV) or changes lane to the left to maintain speed. Moreover, when Δ*x<sub>n,n+1</sub>* = 30 m (i.e., Δ*x<sub>n−1,n</sub>* = 10 m), the game results show a distinct feature depending on relative speed. The cooperative action strategy (*s*1, *l*1) is chosen by the stage game until *v<sub>n</sub>* is slightly higher than *v<sub>n−1</sub>*. If Δ*v<sub>n,n−1</sub>* ≥ 8 km/h, the driver of the SV chooses the 'overtake (*s*3)' action due to a relatively small TTC, in order to avoid harsh braking. Of the overtaking vehicles, 97.7% were found to have a speed higher than that of the freeway mainline vehicles [27]. Thus, this game model can reasonably represent decision-making results according to relative speed.

For the sensitivity analysis on spacing, the stage game was played with various lag spacings from 0 m to 40 m. The PV and LV are assumed to drive at 80 km/h, and the SV's speed is 70 km/h or 90 km/h. The game results of all scenarios are shown in Figure 11. In the figure, the x-axis indicates the lag spacing (Δ*x<sub>n,n+1</sub>*); hence, an increase in Δ*x<sub>n,n+1</sub>* means a decrease in the lead spacing (Δ*x<sub>n−1,n</sub>*).

**Figure 11.** Game results on spacing: (**a**) *v<sub>n</sub>* = 70 km/h; (**b**) *v<sub>n</sub>* = 90 km/h.

When *v<sub>n</sub>* < *v<sub>n−1</sub>*, as shown in Figure 11a, the stage game results show that the driver of the SV decides on the 'wait (*s*2)' action when the lag spacing is less than 10 m. In other words, the results indicate that a slower SV requires a spacing of more than 10 m to choose the 'change (*s*1)' action. Depending on the spacing, competitive decision-making is also expected. This trend is also found in the choice of the 'overtake (*s*3)' action when *v<sub>n</sub>* > *v<sub>n−1</sub>*. In Figure 11b, the driver of the SV decides to overtake when Δ*x<sub>n−1,n</sub>* ≤ 12 m. Therefore, the sensitivity results indicate that the stage game reasonably explains the differences in drivers' choices according to spacing.

In the results, decisions included in the non-cooperative action strategy set, i.e., (*s*1, *l*2), are found in a specific decision-making region, colored black in Figure 9. This region implies that this strategy set, decided simultaneously by the two drivers, puts them into competition: the driver of the SV wants to change lane after trying to ensure safe lead and lag gaps, while the driver of the LV does not allow the SV to merge. During the game period, one driver should change their initial decision to avoid a potential collision, and the final decision set would then be cooperative. In addition, due to an imbalance in the number of observations of each action strategy, the (*s*2, *l*1) action set could not be determined in this sensitivity analysis. Field data, including the NGSIM data, make it clear that merging maneuvers are usually cooperative, as the driver of the LV perceives the SV's lane-changing intention; compared to cooperative merging, non-cooperative cases are only occasionally observed. The stage game results describe cooperative behaviors, and competition between drivers is found at certain relative speed and spacing profiles. Consequently, the stage game model proposed in this study successfully explains rational human drivers' decision-making results.

#### **6. Simulation Case Study**

In this section, a simulation study is presented to demonstrate the performance of the game model based on the developed stage game for merging. For this case study, a microscopic simulation model based on an ABM method, including a vehicle acceleration controller, was developed. To verify the performance of the ABM, a comparison between the NGSIM data and the simulation results is provided. The simulation setting is defined first, and then various merging scenarios representing both cooperative and non-cooperative cases are explained. Finally, the simulation results for each scenario are presented.

#### *6.1. Simulation Model Development*

To investigate whether the repeated game model is efficient to use in microscopic traffic simulation, we used an ABM approach. ABM is a powerful simulation method that is widely applied to real-life problems [47–49]. This study developed a simulation model, built in MATLAB, using the ABM method combined with the game model. ABM is a suitable approach for simulating the actions and interactions of intelligent entities, including individual people. Collaboration and competition, in particular, are major concerns in game theory; these are two typical types of human interaction addressed in several ABM methods [50]. One of the situations in which ABM is applicable is when interactions among agents are heterogeneous and can lead to network effects [48,51]. Thus, this study develops a simulation model to explain merging interactions.

According to Zheng et al. [49], ABMs for transportation systems in today's literature generally have the distinguishing feature of integrating three components: drivers' action decisions, drivers' route decisions, and microsimulation. As the microsimulation component, the simulation model developed in this study simulates vehicle movements based on the position and speed profile determined by an acceleration controller at each time step. As shown in Figure 12, the controller consists of a game module and a car-following module. Following the game model for the drivers' action decision component, the driver of the SV plays a stage game with the driver of the LV in the target lane. Depending on the action strategies at each game time, both drivers determine the acceleration level needed to accomplish their own strategy. In the car-following module, the desired acceleration level is decided by the RPA car-following model. In this acceleration controller, neither the individual demographic nor the travel characteristics of either agent are considered.

**Figure 12.** Vehicle acceleration controller structure in the developed simulation model.

As the game results show, when the driver of the SV chooses the 'change (*s*1)' action, they evaluate the lead and lag spacing for gap acceptance to ensure sufficient spacing and avoid a collision. If the instantaneous gap is large enough to change lane, the SV begins merging onto the freeway, and the driver of the LV, on recognizing the SV's lane change, determines the acceleration level to follow the SV using the car-following model. A route decision module is not required because the merging scenarios are tested on a one-lane freeway network that includes a merging ramp.

The car-following module estimates the desired acceleration level based on the instantaneous spacing between vehicles and their speeds at each time step *t*. This study used two components of the RPA car-following model, i.e., steady-state and collision avoidance, for the module [43]; the detailed definitions and formulas of these components are described in [43]. Figure 13 shows the performance of the car-following module in a case in which five vehicles formed a platoon. Each vehicle decides its acceleration level to follow the preceding vehicle using the RPA car-following model. Here, it was assumed that at simulation time 0 the vehicles were located with shorter spacing than the steady-state spacing of Van Aerde's car-following model [44]. As illustrated in Figure 13, the following vehicles therefore initially decreased speed to ensure proper spacing between vehicles. They then began to accelerate, in platoon sequence, after sufficient spacing was ensured. The acceleration levels and speeds oscillated for a while and then stabilized.
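The qualitative platoon behavior in Figure 13 (initial deceleration, damped oscillation, then stabilization) can be reproduced with a much simpler controller than the RPA model. The sketch below substitutes a constant-time-gap rule (an assumption for illustration, not the RPA formulation in [43]); all gains and spacings are hypothetical.

```python
import numpy as np

# Simplified stand-in for the car-following module: each follower
# accelerates in proportion to its spacing error relative to a
# constant-time-gap desired spacing (NOT the RPA model of [43]).
DT, T_GAP, S0, K = 0.5, 1.5, 5.0, 0.5   # step (s), time gap (s), jam spacing (m), gain

def step(pos, vel):
    acc = np.zeros_like(vel)
    for n in range(1, len(pos)):                  # vehicle 0 is the platoon leader
        desired_spacing = S0 + T_GAP * vel[n]
        spacing = pos[n - 1] - pos[n]
        acc[n] = K * (spacing - desired_spacing)  # close the spacing error
    vel = np.maximum(vel + acc * DT, 0.0)         # speeds cannot go negative
    pos = pos + vel * DT
    return pos, vel

# Five vehicles initially closer than their steady-state spacing:
pos = np.array([100.0, 92.0, 84.0, 76.0, 68.0])
vel = np.full(5, 20.0)
for _ in range(100):                              # 50 s of simulation
    pos, vel = step(pos, vel)
```

With these gains the followers first brake (initial spacing of 8 m is well below the 35 m steady-state value), oscillate, and settle back to the leader's speed, mirroring the behavior described for Figure 13.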

**Figure 13.** Performance of the car-following module.

The game module begins operating as soon as the SV enters the acceleration lane, with the nearest following vehicle in the target lane as the opposing player. In this module, there are two types of merging game: (1) the one-shot game and (2) the repeated game. The one-shot game uses the instantaneous payoffs for each action strategy set, computed from the spacing and speed profiles at time *t*, i.e., *Pij*(*t*) and *Qij*(*t*). The repeated game, on the other hand, uses the cumulative payoffs. Regardless of the game type, the two players decide an action strategy set subject to the Nash equilibrium. Based on the action chosen at time *t*, the desired acceleration level for each vehicle is calculated to execute that vehicle's individual action strategy. For the SV, the desired acceleration level is determined as stated below:


In addition, the driver of the LV decides the acceleration level for the 'yield (*l*1)' action by accepting the SV's merging intention. To provide safe spacing for merging, the LV's acceleration level is calculated by the car-following model under the assumption that the SV has become a potential lead vehicle. For the 'block (*l*2)' action, on the other hand, the driver of the LV accelerates to pass the SV by decreasing the spacing; this decrease in spacing is regarded as a blocking intention.
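The repeated game's cumulative payoffs (defined in Section 3.2, not reproduced here) can be sketched as a δ-weighted accumulation of the instantaneous payoff matrices at each 0.5 s game point; the geometric weighting below is an assumption used for illustration, as are the payoff numbers.

```python
def cumulative_payoffs(instant_payoffs, delta):
    """delta-weighted cumulative payoffs for the repeated game module.

    instant_payoffs: sequence of 3x2 payoff matrices P_ij(t) observed at
    successive game points. The geometric weighting delta**t is an
    assumption standing in for the accumulation defined in Section 3.2;
    with delta > 1, later game points carry more weight.
    """
    cum = [[0.0, 0.0] for _ in range(3)]
    for t, P in enumerate(instant_payoffs):
        w = delta ** t
        for i in range(3):
            for j in range(2):
                cum[i][j] += w * P[i][j]
    return cum

# Two hypothetical game points with delta = 1.4 (the best-performing rate):
P_t0 = [[1.0, 0.0], [0.5, 0.5], [0.2, 0.1]]
P_t1 = [[0.8, 0.1], [0.6, 0.4], [0.3, 0.2]]
print(cumulative_payoffs([P_t0, P_t1], delta=1.4))
```

The one-shot game corresponds to discarding the history and using only the latest matrix; the repeated game feeds the accumulated matrices to the equilibrium solver instead.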

#### *6.2. Simulation Model Validation*

Prior to conducting a case study, validation of the simulation model developed in this study was required to determine whether the conceptual model is a reasonably accurate representation of the real world [52] and whether the output of simulations is consistent with real-world output [53]. To validate the simulation model, this study used the graphical comparison technique, in which the graphs of values derived from the simulation model over time are compared with the graphs of values collected in a real system. It is a subjective, yet practical approach, and is especially useful as a preliminary approach [54]. Since the objective of the case study was to verify the repeated game's efficiency, the simulation focuses on presenting microscopic vehicle movements based on rational drivers' decision-making without consideration of individual characteristics. Considering this objective, a mathematical approach, such as statistical testing of simulation results, was not selected for model validation. Therefore, this study provides a graphical comparison between NGSIM data and the results derived from the simulation model to investigate similarity of trend in vehicle position and corresponding spacing.

This study extracted game cases from the NGSIM data in which there was no interference from surrounding vehicles other than the three main vehicles (i.e., the SV, PV, and LV). The vehicles' instantaneous locations and speeds 1.0 s before each case were then prepared as input data for the simulation. The graphical comparison results showing longitudinal vehicle position and spacing are shown in Figure 14. In an example showing a lane-changing situation (see Figure 14a), the vehicle positions and corresponding lead and lag spacings are almost identical. In an example showing an overtaking situation (see Figure 14b), considerable similarity is observed. The results show that the ABM-based simulation model reproduces the longitudinal vehicle positions and spacings found in the NGSIM data. Consequently, it was concluded that the developed simulation model could be used in the case study.

**Figure 14.** Simulation model validation results based on the graphical comparison method: (**a**) changing situation (SV ID: 268, PV ID: 258, and LV ID: 269 in the US101 data collected from 8:05 to 8:20 a.m.) and (**b**) overtaking situation (SV ID: 1108, PV ID: 1112, and LV ID: 1118 in the US101 data collected from 8:20 to 8:35 a.m.).

#### *6.3. Simulation Setting and Cases*

This study conducted case studies on various merging scenarios simulated with a total of five vehicles, including one merging vehicle. Simulation experiments were executed using both the one-shot game model and the repeated game model. As described above, the one-shot game is played independently at every decision-making point, without consideration of previous results, while the repeated game is played using the cumulative payoffs proposed in Section 3.2. In addition, a freeway segment including one merging section was modeled in MATLAB, as illustrated in Figure 15. The length of the freeway mainline was 1.0 km, and the 250 m acceleration lane was located 80 m downstream of the beginning of the network. The details of the simulation settings are defined as follows.


**Figure 15.** Simulation network configurations.

A total of five simulation cases were prepared, as summarized in Table 5, to represent plausible merging cases defined by diverse input values of three factors: the freeway mainline vehicles' average speed (*v<sub>fwy</sub>*), the SV's initial speed (*v<sub>n</sub>*), and the initial lag spacing (Δ*x<sub>n,n+1</sub>*). There are two main categories of merging: cooperative and competitive. Cooperative merging cases, in which the drivers' decision set is collaborative by the common consent of both drivers, represent typical cases of selecting one of the three gap types: the forward gap, the adjacent gap, or the backward gap. In contrast, a competitive merging case represents an example of conflict between the two drivers' behaviors. For example, the driver of the SV, wanting to use the adjacent gap, prepares to merge onto the freeway by turning a signal on and then executing a lane change; at that time, the driver of the LV decides not to allow the cut-in, to avoid the expected considerable deceleration. One of the drivers must change their initial decision to avoid a potential collision. This competitive situation is not common, but many drivers have likely experienced it. Thus, we picked two such cases to show not only the game model's performance in non-cooperative cases but also the differences between the two game models in competitive scenarios.


**Table 5.** Initial Conditions of Merging Scenarios for Case Study.

#### *6.4. Case Study Results*

Cooperative and competitive cases were tested using the developed simulation model. To validate the repeated game model's performance, the simulation results using the repeated game model are compared with those using the calibrated stage game played independently at every decision-making point, i.e., the one-shot game model.

In the cooperative scenarios, a dominant action strategy is found through rational decision-making because the situation is clear-cut. The simulation model using the repeated game shows performance very close to that of the model using the one-shot game, as the game results are the same at each game point. Since a mixed-strategy Nash equilibrium exists in the competitive cases, both drivers decide an action strategy according to the probabilities of the actions. For the case study results, this study presents the typical outcome of each scenario when there is no distinct difference between the two game models' decision-making; otherwise, especially in the competitive scenario, the simulation results of each game model are presented individually.

#### 6.4.1. Case 1: Cooperative Merging Scenario Using an Adjacent Gap

The simulation results for the first case, shown in Figure 16, indicate that the SV smoothly merged onto the freeway. As described in the sensitivity analysis, the developed game model is able to represent drivers' decisions in normal cooperative merging cases. According to the game results shown in Figure 17, the drivers chose the 'change (*s*1) and yield (*l*1)' action set during the game period. The SV slightly accelerated, following the speed harmonization rules, in preparation for merging, while the LV decelerated to accept the SV's lane change. When the lead and lag gaps were acceptable, the SV merged onto the freeway mainline. In the simulation, the driver of the SV controlled the vehicle's speed via the car-following rule as soon as it executed the lane change, and its following vehicles also showed oscillations in their speed profiles while ensuring a safe gap.

**Figure 16.** Graphical representation of simulation results in case 1. Note that the red solid line indicates the simulation data of the SV (vehicle *n*) during the game period, whereas the blue solid line shows the SV's data at simulation times outside the game period.

**Figure 17.** Decision-making game results in case 1.

#### 6.4.2. Case 2: Cooperative Merging Scenario Using a Backward Gap

The simulation results for the second case, shown in Figure 18, indicate that the driver of the SV used the backward gap after allowing the initial LV to overtake the SV. In Figure 19a, the drivers decided on the 'wait (*s*2) and block (*l*2)' action strategies, respectively. The LV accelerated to block the merge, and the SV also accelerated for speed synchronization even though its driver had decided to take the 'wait (*s*2)' action. As soon as the initial LV overtook the SV, a new merging decision-making game was identified in which vehicle *n* + 2 became the new LV. The results of the second game are shown in Figure 19b. The SV continuously chose the 'change (*s*1)' action until the gap acceptance rule was satisfied, then moved to the freeway mainline in consideration of gap size and relative speed. The LV in the second game, i.e., vehicle *n* + 2, decelerated in a yielding action in response to the SV's intention to merge. In conclusion, the merging decision-making model was shown to depict a typical waiting scenario under both game models.

**Figure 18.** Graphical representation of simulation results in case 2. Note that the red solid line indicates the simulation data of the SV (vehicle *n*) during the game period, whereas the blue solid line shows the SV's data at simulation times outside the game period.

**Figure 19.** Decision-making game results in case 2: (**a**) Initial game with *n* + 1; (**b**) additional game with *n* + 2.

#### 6.4.3. Case 3: Cooperative Merging Scenario Using a Forward Gap

In the overtaking scenario, the time–space diagram in Figure 20 shows that the SV took the forward gap and then merged onto the freeway. When the SV entered the acceleration lane, as presented in Figure 21a, the SV and LV chose the 'overtake (*s*3) and yield (*l*1)' action set. Although the LV decided on the yielding action, it maintained its speed during the first game period while observing the SV's passing. After overtaking the lead vehicle, the SV began to decrease its speed to harmonize with that of the freeway vehicles. As shown in Figure 21b, the new LV, i.e., the vehicle that had been the lead vehicle in the first game period, selected the yielding action in interaction with the SV and therefore showed a sharp deceleration during the second game period. The SV remained in the acceleration lane, then changed lane as soon as the gap acceptance rule was satisfied. As described in the simulation setting, the overtaking scenario is usually observed in congested traffic conditions. This lane change by an overtaking action thus caused a large oscillation in the speed profiles because spacing between vehicles is generally small under congested conditions. It is concluded that the simulation model based on the proposed game model represents well the backward-forming shockwave induced by merging traffic in congested conditions.

**Figure 20.** Graphical representation of simulation results in case 3. Note that a red solid line indicates simulation data of the SV (vehicle *n*) during the game period, whereas a blue solid line shows the SV's data outside the game period.

**Figure 21.** Decision-making game results in case 3: (**a**) Initial game with *n* + 1; (**b**) additional game with *n* − 1.
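The gap acceptance rule invoked in the cases above, i.e., merge only when the gap size and relative speed permit, can be sketched as a speed-dependent critical-gap check. The threshold form, the parameters `g_min` and `k`, and all numeric values below are illustrative assumptions, not the paper's calibrated rule.

```python
def gap_accepted(lead_gap, lag_gap, rel_speed_lead, rel_speed_lag,
                 g_min=5.0, k=0.5):
    """Illustrative gap acceptance rule: the SV merges only if both the
    lead and lag gaps exceed speed-dependent critical gaps.
    g_min: minimum acceptable gap [m]; k: sensitivity to relative speed
    [m per m/s]. rel_speed_lead = v_lead - v_sv; rel_speed_lag = v_lag - v_sv.
    (Assumed functional form, not the authors' calibrated model.)"""
    crit_lead = g_min + k * max(0.0, -rel_speed_lead)  # SV closing on leader
    crit_lag = g_min + k * max(0.0, rel_speed_lag)     # lag vehicle closing
    return lead_gap > crit_lead and lag_gap > crit_lag

# A 12 m lead gap and 10 m lag gap pass the check despite modest closing
# speeds, while a 4 m lead gap is rejected even with zero relative speed.
ok = gap_accepted(lead_gap=12.0, lag_gap=10.0,
                  rel_speed_lead=-2.0, rel_speed_lag=1.0)
too_close = gap_accepted(lead_gap=4.0, lag_gap=10.0,
                         rel_speed_lead=0.0, rel_speed_lag=0.0)
```

Inflating the critical gap when the relative speed is unfavorable reproduces the behavior described in the cases: the SV keeps playing the game until both spacing and speed conditions are met.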

#### 6.4.4. Case 4: Competitive Merging Scenario Choosing an Adjacent Gap or a Backward Gap (1)

In the fourth, competitive, merging case, as presented in Figure 22, the SV spent a relatively longer time playing the decision-making game than in the previous three cases. The initial game result of (*s*1, *l*2) is observed in Figure 23a. Under this non-cooperative action strategy set, both drivers compete to achieve their own objectives. At the third decision-making point, their decision becomes (*s*2, *l*2), a cooperative action strategy set. Although the driver of the SV initially wanted to change lanes using the adjacent gap as soon as they entered the acceleration lane, they changed this initial decision in order to avoid a collision after recognizing the opposing driver's aggressive behavior. Thus, the driver finally used the backward gap for merging onto the freeway. From this case, this study concludes that the repeated game model can depict practical changes in drivers' decisions in competitive decision-making, even when using the cumulative payoff function.

**Figure 22.** Graphical representation of simulation results in case 4. Note that a red solid line indicates simulation data of the SV (vehicle *n*) during the game period, whereas a blue solid line shows the SV's data outside the game period.
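The mechanism that lets the SV abandon its initial 'change (*s*1)' decision despite the cumulative payoff function can be sketched as follows: each decision point adds the stage-game payoffs to running totals, and a sufficiently negative stage payoff (here, the risk of collision with an aggressively blocking LV) eventually flips the cumulative ranking. All payoff numbers are made-up illustrations, not the paper's calibrated payoffs.

```python
# Illustrative sketch of the repeated game's cumulative payoff update.
# At each decision point the stage payoffs are accumulated, and the driver
# picks the action with the highest cumulative payoff so far.

def choose_action(cumulative, stage_payoffs):
    """Add this stage's payoffs to the running totals and return the
    action that currently has the highest cumulative payoff."""
    for action, payoff in stage_payoffs.items():
        cumulative[action] = cumulative.get(action, 0.0) + payoff
    return max(cumulative, key=cumulative.get)

cum = {}
# Early decision points: changing into the adjacent gap (s1) looks best.
a1 = choose_action(cum, {"s1_change": 2.0, "s2_wait": 1.0})
a2 = choose_action(cum, {"s1_change": 1.5, "s2_wait": 1.0})
# The LV's aggressive blocking makes s1 very costly at the third point;
# the cumulative totals now favor waiting for the backward gap (s2).
a3 = choose_action(cum, {"s1_change": -4.0, "s2_wait": 2.5})
```

The key point is that the switch requires the accumulated evidence against *s*1 to outweigh its earlier advantage, which is why the decision change happens at the third point rather than instantly.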

#### 6.4.5. Case 5: Competitive Merging Scenario Choosing an Adjacent Gap or a Backward Gap (2)

In Case 5, the simulation results show that the SV used the backward gap for merging onto the freeway regardless of which game model was used, as illustrated in Figures 24 and 25. This example shows a competition to choose between an adjacent gap and a backward gap, as in Case 4. However, it differs in that the initial decision in Case 5 is a cooperative action strategy.

**Figure 23.** Decision-making game results in case 4: (**a**) Initial game with *n* + 1; (**b**) additional game with *n* + 2.

**Figure 24.** Graphical representation of simulation results in case 5 using the repeated game model. Note that a red solid line indicates simulation data of the SV (vehicle *n*) during the game period, whereas a blue solid line shows the SV's data outside the game period.

**Figure 25.** Graphical representation of simulation results in case 5 using the one-shot game model. Note that a red solid line indicates simulation data of the SV (vehicle *n*) during the game period, whereas a blue solid line shows the SV's data outside the game period.

In Figure 26a, when the repeated game model was used, the driver of the SV chose a 'wait (*s*2)' action during the first game period and then decided to change lanes in the second game period. While the decision-making results were maintained under the repeated game model, an oscillation in decision-making is revealed when the one-shot game is used, as shown in Figure 27a. One reason the one-shot game model produces unstable decisions is that the stage game decides a driver's action from instantaneous vehicle location, speed, and acceleration data alone, without consideration of previous game results (i.e., decisions made at previous decision points). Considering the goal of each action, a change from a non-cooperative strategy set to a cooperative one is required in order to avoid a collision (if (*s*1, *l*2) is chosen) or unnecessary deceleration (if (*s*2, *l*1) is selected). However, changes between cooperative action strategy sets (i.e., (*s*1, *l*1) and (*s*2, *l*2)) are not realistic except when a surrounding vehicle intervenes. This case shows a distinct difference in simulation results depending on which of the two game models is used. Oscillation in decision-making may reduce the performance of microscopic traffic simulation models, even though it is only observed in specific competitive merging situations.

**Figure 26.** Decision-making game results in case 5 using the repeated game model: (**a**) Initial game with *n* + 1; (**b**) additional game with *n* + 2.

**Figure 27.** Decision-making game results in case 5 using the one-shot game model: (**a**) Initial game with *n* + 1; (**b**) additional game with *n* + 2.
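The contrast between the two models can be illustrated with a toy payoff sequence: the one-shot model re-decides from the instantaneous payoffs alone and flips back and forth when those payoffs straddle the decision boundary, while the repeated model's cumulative totals damp the oscillation. All payoff numbers are illustrative assumptions, not the paper's calibrated values.

```python
def one_shot_decisions(payoff_stream):
    """One-shot model: each decision uses only that stage's payoffs."""
    return [max(p, key=p.get) for p in payoff_stream]

def repeated_decisions(payoff_stream):
    """Repeated model: each decision uses payoffs accumulated so far."""
    cum, decisions = {}, []
    for p in payoff_stream:
        for action, v in p.items():
            cum[action] = cum.get(action, 0.0) + v
        decisions.append(max(cum, key=cum.get))
    return decisions

# Noisy instantaneous payoffs around the change/wait decision boundary
# (made-up numbers): s1 = change, s2 = wait.
stream = [{"s1": 1.0, "s2": 0.0},
          {"s1": -0.5, "s2": 0.2},
          {"s1": 1.0, "s2": 0.0},
          {"s1": -0.5, "s2": 0.2}]

one_shot = one_shot_decisions(stream)   # oscillates: s1, s2, s1, s2
repeated = repeated_decisions(stream)   # stable: s1 at every point
```

Because the cumulative totals carry the history forward, a single unfavorable stage cannot overturn the running decision, which mirrors the stable trajectories in Figure 26 versus the flip-flopping in Figure 27.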

#### **7. Conclusions**

Drivers' behavior has a significant impact on the safety and throughput of the transportation system. This is especially true for traffic conflicts between merging and through vehicles, in that merging vehicles induce shockwaves, which reduce roadway capacity and result in traffic congestion. Consequently, modeling driving behavior thoroughly and accurately is critical for analyzing traffic flow in microscopic traffic simulation and for taking advantage of advanced vehicle-driving technologies and strategies in AVs. The purpose of this study was to update the repeated game lane-changing model proposed in [13]. This game model interprets the interaction between drivers, in contrast to most lane-changing models, which focus only on the lane-changing vehicle. In this study, the payoff functions were newly formulated, focusing not only on improvements in prediction performance but also on use in microscopic traffic simulators. In the model evaluation, the developed model captured drivers' merging behaviors with a prediction accuracy of about 86%, an improvement of about 12% over [13]. This study also presented a sensitivity analysis showing that the developed model can depict rational merging decision-making according to variations in the related factors: game location, relative speed, and gap size. Moreover, in order to demonstrate why the repeated game is required in microscopic traffic simulation, a case study was conducted using the developed ABM to simulate merging situations. The repeated game model showed superior performance to a one-shot game model, in which each stage game is played independently, in representing practical merging behaviors in cooperative and competitive merging scenarios.

To develop this work into a state-of-the-art lane-changing model, the decision-making model based on the game-theoretical approach needs to be expanded to cover both mandatory and discretionary lane changes. Since lane-changing decisions can be affected by several factors (e.g., road design, traffic stream conditions, driving skill, driver aggressiveness), the model should be calibrated on field data collected under various conditions. Lastly, the game model can be applied to advanced vehicle systems, such as AVs, which coexist with human-operated vehicles on the roadway. A model based on the game-theoretical approach is anticipated to be well suited to deciding lane-changing maneuvers and predicting surrounding drivers' behaviors.

**Author Contributions:** Conceptualization, K.K.; methodology, K.K. and H.A.R.; validation, K.K.; simulation, K.K.; formal analysis, K.K. and H.A.R.; writing—original draft preparation, K.K.; writing—review and editing, H.A.R.; visualization, K.K.; supervision, H.A.R. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded partially by the University Mobility and Equity Center (UMEC) and a gift from the Toyota InfoTechnology Center.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
