2.1.2. Lane Detection Algorithm

Figure 1 depicts the algorithm used in this study for lane detection. It utilizes OpenCV, an open-source computer vision library, for image processing. After performing the lane detection algorithm, the detected lane information is used to calculate the distance to the object in front of the vehicle.

**Figure 1.** Flowchart of lane detection algorithm.

Figure 2a presents an input image captured by the fixed left camera. The ROI corresponds to 20–50% of the image height; for a 1920 × 1080 image, this is the rectangular area with vertices (0, 216), (1920, 216), (1920, 540), and (0, 540), as depicted in Figure 2b.
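The ROI selection above amounts to a simple row crop. A minimal sketch, using a synthetic array in place of the actual camera frame:

```python
import numpy as np

# Synthetic stand-in for a 1920 x 1080 input frame (H, W, 3).
frame = np.zeros((1080, 1920, 3), dtype=np.uint8)

# The ROI spans 20-50% of the image height and the full width:
# vertices (0, 216), (1920, 216), (1920, 540), (0, 540).
top = int(0.2 * frame.shape[0])     # 216
bottom = int(0.5 * frame.shape[0])  # 540
roi = frame[top:bottom, :]          # shape (324, 1920, 3)
```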

**Figure 2.** Illustration of the steps involved in lane detection: (**a**) the input image, (**b**) selection of the appropriate ROI in the input image, (**c**) the image obtained via grayscale conversion, (**d**) the edge-detected image, (**e**) the image obtained via filtering following Hough transform, (**f**) the color-detected image after conversion into the HSV format, (**g**) the combination of the straight-line-detected image and the color-detected image, (**h**) the identified lane following the perspective transform, (**i**) the generated lane following quadratic curve fitting, (**j**) the final lane-detected image.

Figure 2c depicts the image obtained from the previous one via grayscale conversion. The single-channel image was generated by averaging the pixel values corresponding to the R, G, and B channels.
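The unweighted channel averaging described above (as opposed to OpenCV's luminance-weighted `cv2.cvtColor` conversion) can be sketched directly in NumPy:

```python
import numpy as np

# A single illustrative RGB pixel.
rgb = np.array([[[10, 20, 30]]], dtype=np.uint8)

# Single-channel image: the plain mean of the R, G, and B values,
# truncated back to 8-bit. (10 + 20 + 30) / 3 = 20.
gray = rgb.mean(axis=2).astype(np.uint8)
```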

Figure 2d depicts the result of edge detection: a Gaussian filter was applied to remove noise, and the Canny edge detector was used to generate the edge-detected image. Straight lines corresponding to the lane marks were then obtained by applying the Hough transform to the edge components, as depicted in Figure 2e. Subsequently, straight lines whose gradients deviated from the horizontal or vertical by at most 5° were removed, eliminating lines that were unlikely to correspond to the lane.

The yellow pixels were extracted from Figure 2b, and the result is depicted in Figure 2f. After converting the image from the RGB format to the HSV format, a yellow color range was selected. With the hue, saturation, and value channels normalized to the interval 0–1, yellow pixels were defined by hue values of 0–0.1 or 0.9–1 and saturation values of 0.12–1. The threshold for the value channel was set to one-third of the mean brightness of the image. When the value of a pixel was within the set ranges, the pixel was set to 255; otherwise, it was set to zero.

Figure 2g depicts the combination of the image obtained by extracting straight lines to identify pixels corresponding to the lane candidates and the image obtained by extracting the color. The combination was obtained by assigning weights of 0.8 and 0.2 to the images in Figure 2e,f, respectively.

Further, the lane candidates were obtained using the sliding window method after removing the perspective from the image presented in Figure 2g, and the output is depicted in Figure 2h. The reference image was captured in advance with the optical axis of the camera parallel to the road and the vehicle located at the center of a straight road, so that the image can be warped to make the left and right lanes on a straight road parallel. The coordinates (765, 246), (1240, 246), (1910, 426), and (5, 516) of the four points on the lanes visible within the ROI were relocated to the points (300, 648), (300, 0), (780, 0), and (780, 648) in the warped image to align the lanes along straight lines. A search window was selected whose width (54 pixels) and height were one-twentieth and one-sixth, respectively, of those of the warped image; the window with the largest pixel sum was then identified via the sliding window method.

Subsequently, a lane curve was generated by fitting a quadratic curve to the pixels of the lane candidate, as depicted in Figure 2i. The quadratic curve fitting is performed using the least-squares method, and the positions of the pixels of the lane candidate are indicated by the six windows on the left and right lane marks in Figure 2h.
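The least-squares quadratic fit can be sketched with `np.polyfit`; the window-center coordinates below are illustrative values, not measured data:

```python
import numpy as np

# Centers (y, x) of the six sliding windows on one lane mark.
ys = np.array([50, 150, 250, 350, 450, 550], dtype=float)
xs = 0.0002 * ys**2 - 0.1 * ys + 320   # points on a known parabola

# Least-squares quadratic fit x = a*y^2 + b*y + c, as in the text;
# fitting x as a function of y handles near-vertical lanes.
a, b, c = np.polyfit(ys, xs, 2)
```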

Finally, lane detection based on the input image was completed. The final result, obtained by applying the lane curve to the input image via perspective transform, is depicted in Figure 2j.

#### *2.2. Method for Calibrating Distance Measurement*

#### 2.2.1. Image Distortion Correction

Images captured by cameras exhibit radial distortion due to the refraction of the convex lens and tangential distortion due to the imperfect leveling inherent to the manufacturing of lenses and image sensors. The circular distortion induced by radial distortion at the edges of the image and the elliptical distortion induced by tangential distortion require correction. By applying the distortion model to the coordinates of each pixel, the values of pixels in the distorted image can be used as the values of the corresponding pixels in the corrected image [29].

In this study, OpenCV's built-in functions for checkerboard pattern identification, corner point identification, and camera calibration were adopted for image processing. To correct the input image, a 6 × 4 checkerboard image was captured using the camera, its corner points were identified, and the camera matrix and distortion coefficients were calculated based on the points obtained. Figure 3a depicts the identification of the corner points in the original image, and Figure 3b depicts the screen after the removal of distortion.

**Figure 3.** Checkerboard images utilized for distortion correction: (**a**) identification of the corner points in the checkerboard, (**b**) the output image.

#### 2.2.2. Image Rectification

The parallel stereo camera configuration uses two cameras whose optical axes are parallel. It is particularly suitable for image processing because of the absence of vertical disparity [30]. In practice, however, captured images require rectification to correct the vertical disparity originating from the installation or internal parameters of the cameras. Rectification warps the left and right images obtained with the dual cameras so that any given object appears at the same vertical coordinate in both.

In this study, OpenCV's built-in stereo calibration and stereo rectification functions were adopted for image processing. In addition, the checkerboard image utilized during the removal of the image distortion was used to identify the checkerboard pattern and its corner points and calibrate the dual cameras (Figure 4a). As depicted in Figure 4b, the internal parameters, rotation matrix of the dual-camera configuration, and projection matrix on the rectified coordinate system can be obtained based on a pair of checkerboard images captured using the dual cameras.

**Figure 4.** Checkerboard images utilized for stereo calibration and rectification: (**a**) identification of the corner points, (**b**) the rectified image pair.

#### 2.2.3. Focal Length Correction

Dual cameras were installed collinearly such that the optical axes of the two cameras are parallel. Furthermore, the lenses were positioned at identical heights above the ground. The 3D coordinates of any object were calculated relative to the camera positions, based on the geometry and triangulation of the cameras depicted in Figure 5. It can be described as follows:

$$\begin{array}{l} Z = \dfrac{fb}{d} \\[4pt] X = \dfrac{Z(x_l + x_r)}{2f} \\[4pt] Y = \dfrac{Z(y_l + y_r)}{2f} \end{array} \tag{1}$$

where:

*X*, *Y*, *Z*—the coordinates of the object in the local coordinate system with its origin at the center of the dual cameras.

*f*—focal length.

*b*—baseline.

*d*—disparity.

*xl*, *yl*—the coordinates of the object in the left camera image plane.

*xr*, *yr*—the coordinates of the object in the right camera image plane.

**Figure 5.** Parallel stereo camera model.
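The triangulation in Equation (1) can be applied directly; the focal length, baseline, and pixel coordinates below are illustrative assumptions:

```python
def triangulate(xl, yl, xr, yr, f, b):
    """Recover (X, Y, Z) from matched pixel coordinates via Eq. (1).

    f is the focal length in pixels, b the baseline in metres, and
    (xl, yl), (xr, yr) the object's left/right image coordinates
    relative to the image centers.
    """
    d = xl - xr                 # disparity
    Z = f * b / d               # depth
    X = Z * (xl + xr) / (2 * f)
    Y = Z * (yl + yr) / (2 * f)
    return X, Y, Z

# Example: f = 1000 px, b = 0.3 m, disparity 60 px -> Z = 5 m.
X, Y, Z = triangulate(130, 40, 70, 40, 1000, 0.3)
```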

The focal length is an essential parameter in the calculation of the Z-coordinate. However, a problem with the use of inexpensive webcams is that some manufacturers do not provide details such as the focal length. Furthermore, errors originating from image correction necessitate an accurate estimation of the focal length.

This can be achieved by employing curve fitting based on actual measurements of the relationship between distance and disparity, where *Za* is calculated from the equation:

$$Z_a = \alpha \frac{b}{d} + \beta \tag{2}$$

where:

*X*, *Y*, *Z*—the coordinates of the object in the universal Cartesian coordinate system.

*Za*—the distance to the object.

*α*, *β*—the coefficients obtained via the focal length correction.

During testing, images of objects placed at intervals of 0.5 m over the range of 1–5 m were captured, and the differences between the X-coordinates of each object in the two camera images were recorded. Then, *α* and *β* were evaluated by fitting Equation (2) to these measurements via the least-squares method.
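Since Equation (2) is linear in *b*/*d*, *α* and *β* follow from a least-squares line fit. A minimal sketch with synthetic disparities consistent with Eq. (2) standing in for the recorded data:

```python
import numpy as np

b = 0.3                            # baseline in metres (assumed)
Za = np.arange(1.0, 5.5, 0.5)      # ground-truth distances, 1-5 m
# Synthetic disparities generated from Eq. (2) with alpha = 950,
# beta = 0.05 (illustrative values, not measured data).
d = 950.0 * b / (Za - 0.05)

# Least-squares line fit of Za against b/d recovers alpha and beta.
alpha, beta = np.polyfit(b / d, Za, 1)
```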

#### **3. Optimization of the Mounting Positions of Dual Cameras**

#### *3.1. Configurations of Test Variables*

#### 3.1.1. Mounting Heights of Cameras

In the proposed equation, the distance is measured relative to the ground. The higher the cameras are mounted, the smaller the fraction of the nearby ground captured in the image. Therefore, the mounting height of the cameras significantly influences the region captured in the image.

Heights of 30 cm, 40 cm, and 50 cm were considered. In the case of regular passenger cars, 30 cm was selected as the minimum value because their bumpers are at least 30 cm above ground level. The maximum value was set to 50 cm because larger heights made it difficult to capture the ground within 1 m. Figure 6 depicts the input images corresponding to heights of 30 cm, 40 cm, and 50 cm, where the baseline of the cameras is 30 cm and the angle of inclination of the mounted cameras is 12°.

**Figure 6.** Input images corresponding to different heights: (**a**) corresponding to a height of 30 cm, (**b**) corresponding to a height of 40 cm, and (**c**) corresponding to a height of 50 cm.

#### 3.1.2. Baseline of Cameras

Equation (1) is based on the geometry and triangulation of the cameras. Therefore, the baseline between the cameras significantly affects the measurement of distance.

Baselines of 10 cm, 20 cm, and 30 cm were considered in this study. First, 10 cm was selected as the minimum value because it was the smallest feasible baseline; the baseline was then increased in steps of 10 cm to examine its influence. Figure 7 depicts the input images corresponding to baselines of 10 cm, 20 cm, and 30 cm, where the mounting height of the cameras is 40 cm and the angle of inclination of the mounted cameras is 12°.

**Figure 7.** Input images corresponding to various baselines: (**a**) corresponding to a baseline of 10 cm, (**b**) corresponding to a baseline of 20 cm, and (**c**) corresponding to a baseline of 30 cm.

#### 3.1.3. Angle of Inclination of Mounted Cameras

The installation of cameras parallel to the ground reduces the vertical range and hinders close-range observation of the ground. Therefore, it is essential to utilize an optimal angle of inclination when installing the cameras.

Angles of 3°, 7°, and 12° were considered as feasible angles of inclination. First, 3° was selected as the minimum value owing to the difficulty of capturing the ground within a radius of 1 m at smaller angles of inclination from a height of 50 cm. The proportion of the road captured in the image increased as the angle was increased. However, vehicular turbulence or the presence of ramps was observed to affect the inclusion of the upper part of the road in the images. Further, 12° was selected as the maximum angle of inclination as it yielded images in which the road accounted for 20–80% of the vertical range. Figure 8 depicts input images corresponding to angles of inclination of 3°, 7°, and 12°, where the mounting height of the cameras is 40 cm and the baseline of the cameras is 30 cm.

**Figure 8.** Input images corresponding to different angles: (**a**) corresponding to an angle of 3°, (**b**) corresponding to an angle of 7°, and (**c**) corresponding to an angle of 12°.
