2. Camera Calibration
Camera calibration relates directly to lane departure warning: calibration transforms image coordinates into real-world coordinates, and a mapping algorithm remaps the detected lane coordinates in the image onto the actual real-world road. In this section, we describe a camera calibration method that estimates the camera’s extrinsic parameters, i.e., the height of the camera and the three rotation angles of the transformation from the camera coordinate system to the world coordinate system, from three parallel and equally spaced lane-markings on the ground. These extrinsic parameters are necessary for the subsequent lane departure warning step.
Figure 2 shows the three coordinate systems used in this section: (1) the camera coordinate system Oc-XcYcZc; (2) the image coordinate system Oi-XiYi; (3) the world coordinate system Oc’-Xc’Yc’Zc’. Here, the orientation of Oc-XcYcZc is arbitrary. Oi is at the center of the image sensor (called the IMG); the Zc-axis passes through Oi and is perpendicular to the IMG, and f (i.e., the focal length) is the distance between the points Oi and Oc. The Xi- and Yi-axes are parallel to the Xc- and Yc-axes but opposite in direction, respectively. Next, the world coordinate system Oc’-Xc’Yc’Zc’ is defined such that its origin Oc’ coincides with Oc, its Zc’-axis is parallel to the lane-markings, and its Xc’-axis is parallel to the ground plane.
Next, the 3D imaging model of the lane-markings is established based on the pinhole camera model, as shown in Figure 3. Each lane-marking and the point Oc determine a plane (αl, αm, and αr, respectively). The planes αl, αm, and αr intersect the IMG in three lines, ll, lm, and lr, which are the projections of the left, middle, and right lane-markings onto the IMG plane. The lines ll, lm, and lr intersect at the point Pint (the vanishing point of the lane-markings) and intersect the bottom edge lbot of the IMG at the points Pl, Pm, and Pr. The normal line of lbot through Pint intersects lbot at Pp.
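The imaging model above can be checked numerically: under the pinhole model, every 3D line with the same direction as the lane-markings projects to an image line through the same vanishing point. The following sketch is illustrative; the direction vector and focal length are assumed values, not taken from the paper.

```python
import numpy as np

def vanishing_point(d, f):
    """Image coordinates of the vanishing point of 3D direction d
    (camera coordinates), i.e., p = (f*dx/dz, f*dy/dz)."""
    d = np.asarray(d, dtype=float)
    if abs(d[2]) < 1e-12:
        raise ValueError("direction parallel to the image plane has no finite vanishing point")
    return f * d[0] / d[2], f * d[1] / d[2]

f = 800.0                        # focal length in pixels (illustrative)
d = np.array([0.1, -0.2, 1.0])   # lane-marking direction in camera coordinates
x, y = vanishing_point(d, f)     # the same point Pint for all three lane-markings
print(x, y)
```

Because the left, middle, and right lane-markings share one direction, they share one vanishing point Pint, which is why the three projected lines ll, lm, and lr meet there.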
Given the positions of ll, lm, and lr in the IMG (i.e., the positions of the points Pint, Pl, Pm, and Pr), the coordinate system Oc-XcYcZc is transformed to Oc’-Xc’Yc’Zc’ in three steps: (A) rotating θ1 around the Zc-axis to bring the Xc-axis onto the ZcZc’-plane; (B) rotating θ2 around the Yc-axis to make the Zc-axis coincide with the Zc’-axis; (C) rotating θ3 around the Zc-axis to make Oc-XcYcZc coincide with Oc’-Xc’Yc’Zc’. Meanwhile, the camera height h and then the lane-width w’ are estimated from θ1, θ2, and θ3.
2.1. Rotating θ1 around Zc-Axis to Make Xc-Axis on the ZcZc’-Plane
This step rotates the coordinate system by an angle around the Zc-axis so that the Xc-axis lies on the ZcZc’-plane. Afterward, the position of the IMG plane is unchanged, but the coordinate system Oi-XiYi is rotated by θ1 around its origin. Therefore, the positions of ll, lm, and lr remain unchanged in the world coordinate system but change in the image coordinate system Oi-XiYi. Because the Xc-axis is on the plane formed by the Zc- and Zc’-axes, Pint, the intersection of the Zc’-axis and the IMG plane, is still on the Xi-axis, as depicted in Figure 4.
Figure 5 illustrates the imaging differences before (dashed lines, denoted by the “0” subscript) and after (solid lines, denoted by the “1” subscript) this step. tanθ1 is obtained by dividing yint by xint, and the calculated θ1 forms the rotation matrix R1 of Oc-XcYcZc in this step. R1 is then used in the lane departure warning section (i.e., Section 3, (7)) together with R2 and R3 calculated later on. Pl1Pp1, Pm1Pp1, and Pr1Pp1 can be obtained by (1) (ysize is the height of the IMG).
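Step 2.1 can be sketched as follows, assuming (as the text states) that tanθ1 = yint/xint and that R1 is a standard rotation about the Zc-axis; the Pint coordinates below are hypothetical.

```python
import numpy as np

def rotation_z(theta):
    """Standard rotation matrix about the Z-axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

x_int, y_int = 120.0, 45.0         # hypothetical coordinates of Pint in Oi-XiYi
theta1 = np.arctan2(y_int, x_int)  # tan(theta1) = y_int / x_int
R1 = rotation_z(theta1)

# After the inverse rotation, the ray toward Pint has no Yi-component,
# i.e., Pint now lies on the Xi-axis as in Figure 4:
p = np.array([x_int, y_int, 0.0])
print(R1.T @ p)
```

The sign convention (whether R1 or its transpose moves points) depends on whether one rotates the axes or the points; here the transpose is applied to the point so that the result lands on the Xi-axis.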
2.2. Rotating θ2 around Yc-Axis to Make Zc-Axis Coincide with Zc’-Axis
This step rotates the coordinate system by an angle around the Yc-axis so that the Zc-axis coincides with the Zc’-axis.
Figure 6 depicts the position change of the IMG before (dashed lines, Oi1) and after (solid lines, Oi2) this step. Then, the Zc- and Zc’-axes coincide, while Pint coincides with Oi2. Here, the angle between the Zc- and Zc’-axes (i.e., θ2) can be calculated with (2) by using the distance between the points Pint and Oi1, and the calculated θ2 forms the rotation matrix R2 of Oc-XcYcZc in this step. R2 is then used in the lane departure warning section (i.e., Section 3, (7)) together with R1 and R3 calculated later on. As in Figure 7a, the positions of the IMG planes and lines before and after this step are denoted by the “1” and “2” subscripts, respectively. The two IMG planes are perpendicular to the plane (αbot) determined by the bottom lines of the IMG, i.e., lbot2 and lbot1, and αl, αm, and αr intersect αbot in three parallel lines (llcut, lmcut, and lrcut). The lines llcut, lmcut, and lrcut intersect lbot1 at Pl1, Pm1, and Pr1, and intersect lbot2 at Pl2, Pm2, and Pr2. Draw the normal lines of lbot1 and lbot2 at Pint1 and Pint2, respectively; they intersect lbot1 and lbot2 at Pp1 and Pp2. Because lbot1 and lbot2 are perpendicular to the Yc-axis, the angle between them is θ2. Meanwhile, because IMG2 is perpendicular to the Zc2- (which is also the Zc’-) axis, lbot2 is perpendicular to the Zc’-axis and also to llcut, lmcut, and lrcut. As shown in Figure 7b and Equation (3), by similar triangles, the segments Pl1Pp1, Pm1Pp1, and Pr1Pp1 multiplied by cosθ2 equal Pl2Pp2, Pm2Pp2, and Pr2Pp2, respectively.
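Step 2.2 can be sketched as follows. Since Pint coincides with Oi2 after the rotation and the distance from Oc to the IMG is f, a plausible reading of (2) is θ2 = arctan(|PintOi1| / f); this form is an assumption, as is every numeric value below. The cosθ2 segment scaling is the relation stated in (3).

```python
import numpy as np

def rotation_y(theta):
    """Standard rotation matrix about the Y-axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[ c, 0.0, s],
                     [0.0, 1.0, 0.0],
                     [-s, 0.0, c]])

f = 800.0            # focal length in pixels (illustrative)
dist_pint = 150.0    # |Pint - Oi1| after step 2.1 (hypothetical)

# Assumed form of (2): angle of the ray Oc->Pint (the Zc'-axis) to the Zc-axis.
theta2 = np.arctan2(dist_pint, f)
R2 = rotation_y(theta2)

# Per (3), a segment on l_bot1 shrinks by cos(theta2) on l_bot2:
seg_p1 = 42.0                       # e.g., Pl1Pp1 (hypothetical length)
seg_p2 = seg_p1 * np.cos(theta2)
print(theta2, seg_p2)
```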
2.3. Rotating θ3 around Zc-Axis to Make Oc-XcYcZc Coincide with Oc’-Xc’Yc’Zc’
This step rotates the coordinate system by an angle around the Zc-axis so that Oc-XcYcZc coincides with Oc’-Xc’Yc’Zc’. As illustrated in Figure 8, we denote the positions of the IMG planes and parallel lines before and after rotating θ3 around the Zc-axis by the “2” and “3” subscripts, respectively. Similar to Section 2.1, the positions of ll, lm, and lr in the world coordinate system Oc’-Xc’Yc’Zc’ remain unchanged while their positions in the image coordinate system Oi-XiYi change. On the other hand, differing from Section 2.1, the position of Pint stays the same (i.e., coincides with Oi) during this step. Because lbot3 is parallel to the ground plane, the line of intersection (lgnd) of the IMG plane and the ground is also parallel to lbot3. From the known condition that the lane-markings are parallel and equally spaced, the two segments cut by the three lane-markings are equal. By similar triangles, the segments Pl3Pm3 and Pr3Pm3 cut by ll, lm, and lr are also equal. Accordingly, θ3 is obtained by (4), and the calculated θ3 forms the rotation matrix R3 of Oc-XcYcZc in this step. R3 is then used in the lane departure warning section (i.e., Section 3, (7)) together with R1 and R2. In the previous steps, the calculation of θ1 and θ2 involved only the positional relationships of the points Pint and Oi, and two lane-markings suffice to determine the position of Pint. However, the intersection points of ll, lm, and lr with lbot, i.e., Pl3, Pm3, and Pr3, are needed for the calculation of θ3, which demonstrates that three rather than two lane-markings are necessary in the camera-calibration stage.
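The three step rotations compose into the full camera-to-world rotation used later in (7). The sketch below assumes the composition order mirrors steps 2.1–2.3 (first R1, then R2, then R3) and uses hypothetical angles; the individual matrices are standard axis rotations.

```python
import numpy as np

def rot_z(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def rot_y(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

theta1, theta2, theta3 = 0.36, 0.19, 0.05   # hypothetical calibration angles

# Assumed order: apply the step-2.1 rotation first, step 2.3 last.
R = rot_z(theta3) @ rot_y(theta2) @ rot_z(theta1)

# Sanity check: the composition is still a proper rotation
# (orthonormal, determinant +1).
print(np.allclose(R @ R.T, np.eye(3)), np.isclose(np.linalg.det(R), 1.0))
```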
2.4. Calculation of Camera Height and Lane-Width
After the above steps, the camera height h, i.e., the height of Oc, can be calculated by similar triangles (in (5), w is the distance between two adjacent lane-markings of the three parallel and equally spaced lane-markings). Since the three camera rotation angles and the camera height are now known extrinsic parameters, the lane-width w’ can be calculated from the left and right lane-markings when the vehicle’s direction aligns with the lane-markings. In the case of two detected lane edges, suppose the two lane-markings in the frame intersect lbot at the points Plw0 and Prw0, and the normal line of lbot at the intersection point (Pint0) of the two lane-markings intersects lbot at Ppw0; then w’ is calculated by (6) (shown in Figure 9).
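Once h and the rotation are known, detected lane pixels can be remapped to road coordinates by back-projecting each image point onto the ground plane; this is the generic pinhole ray-plane intersection that underlies measurements such as the lane-width, not the paper's exact Equations (5)–(6). The coordinate conventions below (camera at the world origin, ground plane at height −h) are illustrative assumptions.

```python
import numpy as np

def image_to_ground(x_i, y_i, f, R, h):
    """Back-project image point (x_i, y_i) onto the ground plane Yw = -h,
    given camera-to-world rotation R and camera height h."""
    ray = R @ np.array([x_i, y_i, f])   # viewing ray in world coordinates
    if ray[1] >= 0:
        raise ValueError("ray does not hit the ground")
    t = -h / ray[1]                     # scale so the ray reaches Yw = -h
    return ray * t                      # world point on the ground plane

R = np.eye(3)                           # level camera, purely for illustration
p = image_to_ground(0.0, -100.0, 800.0, R, h=1.4)
print(p)                                # p[1] equals -h by construction
```

Back-projecting one point from each of two adjacent lane-markings at the same image row and differencing their lateral coordinates gives a lane-width estimate in the same spirit as (6).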
3. Lane Departure Warning
The extrinsic parameters, i.e., the rotation angles of the camera coordinate system and the camera height, deduced during the camera-calibration stage, are used to calculate the lane departure parameters. In this paper, the yaw angle (θy), which represents the deviation of the vehicle direction from the road direction, can be calculated from only one of the two lane-markings projected onto the IMG plane. Meanwhile, the distance between the lane-marking and the vehicle (xx) is also important for the lane departure decision. As long as at least one lane-marking is detected in the image by the lane detection technique, the two lane departure parameters, θy and xx, can be calculated as described in this section.
The 3D imaging model of the lane-markings and the coordinate systems Oc’-Xc’Yc’Zc’ and Oc-XcYcZc have been described in Section 2. As illustrated in Figure 10, the vehicle coordinate system Oc’’-Xc’’Yc’’Zc’’ is defined by rotating Oc’-Xc’Yc’Zc’ around the Yc’-axis so that the Zc’’-axis aligns with the direction of the vehicle. The angle between the Zc’’- and Zc’-axes is the yaw angle θy.
3.1. Calculation of the Yaw Angle θy
The first step of lane departure warning is to use only one detected lane-marking in the image to calculate the yaw angle θy. The angle between the Zc’’- and Zc’-axes can be calculated by finding the coordinates of the intersection points of the IMG with the Zc’’- and Zc’-axes, respectively (shown in Figure 11). We define the intersection point of the IMG and the Zc’’-axis as Pintc(xintc, yintc), which is the same as Pint(xint, yint) in Section 2. Then, the intersection point of the IMG and the Zc’-axis is defined as Pintd(xintd, yintd), which is the same as the intersection point of ll and lr when two lane-markings are detected. When only one of the two lane-markings is detected (for example, in Figure 11, lr is detected, and it intersects the top and bottom edges of the IMG at the two points Prtop and Prbot, whose x-coordinates in Oi-XiYi are xrtop and xrbot, respectively), the position of Pintd cannot be determined directly. However, it can be calculated using the extrinsic parameters obtained in Section 2.
Since Oc’’-Xc’’Yc’’Zc’’ is defined by rotating Oc’-Xc’Yc’Zc’ around the Yc’-axis, the Zc’’-axis lies in the Xc’Zc’-plane, and the line PintcPintd is the line of intersection of the IMG plane and the Xc’Zc’-plane. Therefore, the angle θxz between the unit vector pint0 of the line PintcPintd and the unit vector xi0 of the Xi-axis can be calculated as shown in (7), using the rotation matrices R1, R2, and R3 calculated in Section 2, given that pint0 is perpendicular to the unit vector zc0 of the Zc-axis. Then, Pintd can be calculated as the intersection point of the detected ll or lr and the line PintcPintd, as in (8).
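Finding Pintd per (8) reduces to intersecting the detected lane line with the line through Pintc at angle θxz to the Xi-axis; a generic 2D line-line intersection suffices. The specific coordinates below are hypothetical.

```python
import numpy as np

def intersect(p1, d1, p2, d2):
    """Intersection of the 2D lines p1 + s*d1 and p2 + t*d2."""
    A = np.column_stack([d1, -d2])
    s, _ = np.linalg.solve(A, p2 - p1)   # fails if the lines are parallel
    return p1 + s * d1

p_intc = np.array([0.0, 0.0])            # Pintc at the image center Oi
theta_xz = 0.1                           # angle of PintcPintd to the Xi-axis (hypothetical)
d_xz = np.array([np.cos(theta_xz), np.sin(theta_xz)])

p_rbot = np.array([300.0, -240.0])       # sample points on the detected lr (hypothetical)
p_rtop = np.array([120.0, 240.0])

p_intd = intersect(p_intc, d_xz, p_rbot, p_rtop - p_rbot)
print(p_intd)                            # lies on both lines
```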
Finally, with the focal length f and the coordinates of the points Pintc and Pintd, θy can be solved using the triangular pyramid formed by the axes and the image sensor in Figure 11 according to (9), and the rotation matrix Ry between the coordinate systems Oc’’-Xc’’Yc’’Zc’’ and Oc’-Xc’Yc’Zc’ is formed by the calculated θy. Ry is then used in the calculation of xx in Section 3.2, (10). The vectors OcOi, OcPintd, and OcPintc are called zc, zc’, and zc’’, respectively.
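Under the reading that (9) gives the angle at Oc between the rays zc’’ = OcPintc and zc’ = OcPintd (each of the form (x, y, f) in camera coordinates), θy can be sketched as below; the point coordinates are hypothetical, and the exact form of (9) is an assumption.

```python
import numpy as np

def yaw_angle(p_intc, p_intd, f):
    """Angle between the rays Oc->Pintc (Zc''-axis) and Oc->Pintd (Zc'-axis)."""
    zc2 = np.array([p_intc[0], p_intc[1], f])   # toward Pintc
    zc1 = np.array([p_intd[0], p_intd[1], f])   # toward Pintd
    cosang = zc2 @ zc1 / (np.linalg.norm(zc2) * np.linalg.norm(zc1))
    return np.arccos(np.clip(cosang, -1.0, 1.0))

theta_y = yaw_angle((0.0, 0.0), (55.0, 6.0), 800.0)   # hypothetical inputs
print(np.degrees(theta_y))
```

When Pintc and Pintd coincide, the vehicle direction aligns with the road direction and θy is zero, matching the definition of the yaw angle.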
3.2. Calculation of the Distance between the Lane-Markings and the Vehicle xx
In this paper, xx is calculated using the 3D imaging model of one of the two lane-markings. As an example, in Figure 12, the right lane-marking lr is detected in the IMG plane. Construct a plane αper perpendicular to the ground plane through the Zc’-axis; it intersects the ground plane in the line lper. It is observed that xx is the distance between lper and the right lane-marking. The IMG plane intersects lper and the right lane-marking at the two points Pgp and Pgr. Define the vectors vbot, vlp, and vlr as PgpPgr, PgpPintd, and PgrPintd, respectively. Accordingly, xx is the x-coordinate of vbot in Oc’-Xc’Yc’Zc’. The angle between vlr and the bottom edge of the IMG is θbr, and the angle between vlr and vbot is θgr. Finally, tanθbr can be obtained using the coordinates of the points Prbot and Pintd, and xx can be calculated from θbr through (10). Here, θgr is calculated using θbr and the angle θxz obtained in Section 3.1 during the calculation of the yaw angle θy. In particular, vbot and vlp can be calculated using the transformation from Oc’-Xc’Yc’Zc’ to Oc-XcYcZc and the dot product of vbot and vlr.
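The first quantity in this chain, θbr, follows directly from the coordinates of Prbot and Pintd as the text states; the sketch below computes it for hypothetical points. The subsequent step from θbr to xx depends on (10), which is not reproduced here.

```python
import numpy as np

# Hypothetical image coordinates in Oi-XiYi:
p_rbot = np.array([300.0, -240.0])   # intersection of lr with the bottom edge l_bot
p_intd = np.array([20.0, 2.0])       # Pintd from Section 3.1

# theta_br: angle between v_lr (= Pgr->Pintd, along the lane line) and the
# horizontal bottom edge of the IMG, so tan(theta_br) = |dy| / |dx|.
dx, dy = p_intd - p_rbot
theta_br = np.arctan2(abs(dy), abs(dx))
print(np.degrees(theta_br))
```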
3.3. Lane Departure Assessment
For real-world application, the departure status of the vehicle is assessed according to the calculated θy and xx. If xx falls below a threshold value, the vehicle is approaching the detected lane-marking. If θy falls below a threshold value, the vehicle is turning toward the detected lane-marking. Together, these two parameters can efficiently and correctly determine the departure status of the vehicle.
Moreover, the vehicle position relative to the other, undetected lane-marking can also be obtained with the calculated lane-width w’. The lane width generally depends on the assumed maximum vehicle width plus additional space to allow for vehicle motion. In the case of only one detected lane edge, the other edge can also be estimated from a typical lane-width w’’. Therefore, lane departure is easily determined using the Time to Lane Crossing (TLC) criterion or other criteria. Besides roads with lane-markings, when the vehicle is on a road without any lane-markings, the above lane departure warning method can be used to keep the vehicle along the leftmost or rightmost edge of the road.
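A minimal decision sketch combining the two parameters with the TLC idea (time-to-lane-crossing as lateral distance over lateral speed) could look as follows. The threshold values, sign conventions, and function name are illustrative assumptions, not the paper's specification.

```python
import math

def lane_departure_warning(x_x, theta_y, speed,
                           x_thresh=0.3, tlc_thresh=1.0):
    """Return True if a warning should be raised.
    x_x: lateral distance to the lane-marking (m);
    theta_y: yaw angle toward the marking (rad, positive = approaching);
    speed: vehicle speed (m/s). Thresholds are illustrative."""
    if x_x < x_thresh:                 # already too close to the marking
        return True
    lateral_speed = speed * math.sin(theta_y)
    if lateral_speed <= 0:             # moving parallel or away: no warning
        return False
    tlc = x_x / lateral_speed          # seconds until crossing the marking
    return tlc < tlc_thresh

print(lane_departure_warning(x_x=0.8, theta_y=math.radians(5), speed=25.0))
```

At highway speed, even a small yaw angle yields a short TLC, which is why both θy and xx are needed for a reliable decision.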
3.4. Lane Detection
As mentioned above, at least one lane-marking should be detected in the image in order to calculate the parameters θy and xx. We applied the open-source method proposed by Qin et al. [29] for lane detection in our study. This method is based on deep segmentation and includes a novel lane detection formulation aimed at extremely fast speed and the no-visual-clue problem. The formulation selects locations of lanes at predefined rows of the image using global features, instead of segmenting every pixel of the lanes based on a local receptive field, which significantly reduces the computational cost. Previous experiments show that this method achieves state-of-the-art performance in terms of both speed and accuracy.
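The row-anchor idea behind [29] can be illustrated schematically: for each predefined row, the network outputs a distribution over gridded column locations plus a "no lane" class, and the lane position in that row is the most probable column. The shapes, names, and random logits below are illustrative, not the actual model.

```python
import numpy as np

rng = np.random.default_rng(0)
num_rows, num_cols = 18, 100                 # predefined row anchors x column grid

# Stand-in for network output: one extra class per row meaning "no lane here".
logits = rng.normal(size=(num_rows, num_cols + 1))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

col_idx = probs[:, :num_cols].argmax(axis=1)     # best column per row
present = probs.argmax(axis=1) != num_cols       # row actually contains a lane?

# One (row, column) point per row where a lane is present; these sparse points
# replace dense per-pixel segmentation, which is the source of the speedup.
lane_points = [(r, int(c)) for r, c in enumerate(col_idx) if present[r]]
print(len(lane_points))
```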
4. Experimental Results
Experiments were conducted on both highways and urban roads, using image sequences captured by a camera mounted on a car at an arbitrary position. At the beginning of the experiments, the camera was calibrated by placing the car parallel to the lane-markings (i.e., the angle between the car direction and the road direction was zero) while all three lane-markings were in the viewfinder of the camera. The reason for placing the car parallel to the lane-markings is that the car direction when taking the calibration image serves as the reference for real driving; if it fails to be parallel to the lane-markings, the error in the estimation of θy becomes large. To reduce the influence of human error, we took several calibration images to optimize the parameters. After that, the lane-markings were detected by the lane detection technique. Then, θ1, θ2, θ3, and the camera height were calculated as described in Section 2. Figure 13a–c shows example frames for camera calibration and the corresponding top views of the experimental environments of the highway and urban road experiments, respectively.
After the calibration step, the position and orientation of the experimental car were changed arbitrarily to simulate real driving situations, while the pose of the camera (i.e., the camera coordinate system) relative to the car remained fixed. Then, an image (called the “driving image”) was taken, and two parameters were manually measured with a steel tape: (a) the distance from the camera to one of the lane-markings (xx); (b) the yaw angle of the experimental car relative to the lane (θy). The lane detection technique was used again to detect the clearest lane-marking in the “driving image”, and the two lane departure parameters, θy and xx, were estimated using the lane departure warning method described above. Figure 13d–f shows example frames for lane departure assessment and the corresponding top views of the experimental environments of the highway and urban road experiments, respectively. To test the algorithm in different situations, the camera’s pose was changed arbitrarily five times in the highway experiment and four times in the urban road experiment. Finally, the estimated quantities and the actual measured values were compared, and the errors were calculated.
The KITTI odometry dataset was initially created for visual odometry and SLAM algorithms [30]. It is almost the only benchmark dataset with ground truth in its No. 00–11 image sequences, including the camera coordinates of each image gathered by a GPS/IMU system. From the transformation matrix of each image, the vehicle’s deviation angle θy in each image can be deduced. However, KITTI does not provide the real values of the distances from the camera to the lane-markings, so the parameter xx cannot be assessed on it. KITTI enables an objective, quantitative performance analysis of the proposed algorithm and the state-of-the-art works without manually measuring the errors. Figure 14 shows example frames of the KITTI odometry dataset.
Table 1 tabulates the experimental results for lane departure assessment. The average error of θy is about 1 degree, and the average error of xx is less than 5 cm. Probable causes of error include small deviations of the vehicle orientation during the calibration process and measurement errors in the real values of the camera positions. To evaluate the warning algorithm directly, lane departure criteria on both θy and xx were defined to calculate the correct warning rate. For the highway experiment, the distance from the car’s front wheel to the lane-marking replaces xx as a criterion, since this parameter is more direct for the departure judgment. For the KITTI dataset experiment, because xx cannot be estimated, only θy is used as the criterion.
Table 2 compares this work to six state-of-the-art algorithms whose previous experiments were mainly conducted on their own dedicated datasets rather than on a public dataset. Like this work, these six algorithms provide their formulas, which allows them to be re-implemented on other datasets. Since these six algorithms need the expressions of the two detected lane-markings in the images, the dataset must contain information on at least two lane-markings. Another condition is setting the threshold values in each algorithm, but few of the algorithms give their threshold values. We chose the threshold values, during the experiments and software-based simulations, that resulted in the best correct warning rate.
Finally, only 604 of the 1546 frames in our dataset include two lane-markings. The best performer on our dataset among all six is [15], which still failed to reach a 90% correct warning rate. The main reason is that these algorithms need two lane-markings, while the angles between the two detected lane-marking lines might change drastically with the deviation angles of the camera. For example, the angle bisector of the two detected lane-markings, the lane-departure judgment parameter in [18], is mainly affected by camera rotation around the Zc-axis.
On the other hand, the proposed algorithm is compared with [25], which uses a 3D imaging model to calculate the θy and xx parameters. The comparison result is shown in Table 3, indicating that the performances of the two algorithms are both excellent and almost the same. The high accuracy indicates the advantage of combining the image information and the road model.
Other than the decision-making parameters θy and xx, the accuracy of the camera height h and the lane-width w’ is also important. We measured the camera height and lane width in the laboratory, and the results are shown in Table 4. The total numbers of frames for testing h and w’ are 205 and 780, and the average errors are less than 2% and 3%, respectively. Figure 15a shows example frames of the h and w’ experiment.
As for curved roads, the tangent line of the curve plays the same role as a straight lane-marking. Therefore, the parameter θy is the deviation of the vehicle direction from the tangent line of the curved lane-marking, and the parameter xx is the distance between the vehicle and that tangent line. The experiment for curved roads estimates the parameter xx, while θy is not estimated because the direction of the tangent line changes as the vehicle moves, making its real value hard to measure. In Table 5, the error of xx is 17.29 cm, and the correct warning rate is 89.25%. The difficulty in detecting the tangent line may increase the error, but the main reason is the camera’s field of view (FOV), which causes a difference between the tangent point and the point used for measuring xx (called the x-point), as shown in Figure 16. The x-point is the closest point on the lane-marking to the vehicle, so the distance from the vehicle to the x-point is the real value of xx. However, for precise detection of the lane-markings in front of the vehicle, the camera must face forward, so it usually cannot capture the x-point, and the point nearest to the x-point that the camera can capture is the tangent point. Therefore, the road’s curvature causes an error between the tangent point and the x-point. The greater the curvature of the lane-marking, the more significant the difference between the slopes of the tangent lines at the tangent point and the x-point, and the bigger the error of the calculated xx. This error could be reduced by estimating the position of the x-point with advanced algorithms in future research. It may seem more natural to detect the curved lane-markings and use the detected curve in the image, rather than the tangent line, to determine lane departure. However, calculating the projective relation between a curve in the world coordinate system and the corresponding curve in the image coordinate system is much more complex. Therefore, using a tangent line is the most practical solution for curved-lane departure warning. Figure 15b shows example frames of the curved road experiment.