5.2.1. Integrated and Self-Built Image Sets Detection
To verify the effectiveness of the training-dataset construction and the robustness of the network model designed in this paper, TuSimple, DataSet-1, and DataSet-3 are used as training sets, ResNet-18 is used as the backbone network, and all other parameters are kept identical. A network model is trained on each dataset separately, and qualitative visual tests are carried out on the self-built image sets.
We selected cities, suburbs, villages, and highways with different environmental conditions, road alignments, and traffic scenes, and integrated four self-built image sets, named SBI-1, SBI-2, SBI-3, and SBI-4, respectively. The SBI-1 image set mainly covers highway, urban straight-road, and simple-curve scenes. The SBI-2 image set mainly covers night scenes with lighting, with reflections, and without lighting. The SBI-3 image set mainly covers pavement damage and pavement shadow, and the SBI-4 image set mainly covers slight and severe occlusion.
In the experiment, models were trained on the newly constructed DataSet-1, DataSet-3, and the TuSimple dataset, and tested on the four image sets. Visual comparisons of typical scenes are shown in Figure 6, Figure 7, Figure 8 and Figure 9: Figure 6 is a visual comparison under SBI-1, Figure 7 under SBI-2, Figure 8 under SBI-3, and Figure 9 under SBI-4.
As can be seen from Figure 6, when only TuSimple is used to train the model, the results adapt well only to highway scenes, and the effect is poor even for some simple urban straight and curved scenes. The model trained on DataSet-1 is less effective than those trained on TuSimple and DataSet-3 in highway scenes, but in urban straight and simple curved scenes the DataSet-1 and DataSet-3 models lead TuSimple by a wide margin. The DataSet-1 model may misjudge a straight line as a curve, because most of that dataset consists of curved scenes. By comparison, the model trained on DataSet-3 performs best across highway, urban straight, and simple curved scenes; its comprehensive performance is therefore the best for these scenarios.
It can be seen from Figure 7 that the model trained on TuSimple essentially fails, unable to detect lane lines in night scenes with lighting, with reflections, or without lighting. The models trained on DataSet-1 and DataSet-3 both achieve good results on SBI-2, but DataSet-1 is less effective than DataSet-3 at handling the no-vision problem. Therefore, for night scenes with lighting, reflections, or no lighting, the model trained on DataSet-3 performs best overall.
As seen from Figure 8, in scenes with obvious illumination changes such as pavement damage and pavement shadow, the model trained on TuSimple fails, while the model trained on DataSet-3 achieves the best results on SBI-3. The DataSet-1 model misses some lane lines and adapts weakly to complex lighting conditions. Therefore, for pavement damage, pavement shadow, and other scenes with obvious illumination changes, the model trained on DataSet-3 performs best overall.
It can be seen from Figure 9 that the model trained on TuSimple adapts worst, with detection failing under both slight and severe occlusion. The model trained on DataSet-1 misses part of the lane lines; it is less effective than DataSet-3 at handling the missing field of view, and shadows degrade its detection. Therefore, for slight- and severe-occlusion scenarios, the model trained on DataSet-3 performs best overall.
In summary, whether in simple or complex scenarios, the model trained on DataSet-3 detects lane lines well under the same network model and parameter settings and shows the strongest generality, especially in complex environments. Therefore, the model training method proposed in this paper can handle more scenarios and achieve good results.
5.2.2. Self-Built Video Sequence
To further illustrate the effectiveness and practicability of the detection method in this paper, we also drove a car equipped with a camera to collect road videos outdoors. Similarly, urban, suburban, rural, and highway roads under different environmental conditions, road alignments, and traffic scenes were selected, including straight roads, curves, night-time lighting, night reflection, no lighting at night, pavement shadow, pavement damage, vehicle occlusion, and so forth, and multiple video sequences were built by ourselves. The model was trained with DataSet-3 on the ResNet-18 backbone network augmented with the mixed-attention mechanism, and the detection performance on the road video sequences was analyzed with the trained model.
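The internals of the mixed-attention mechanism are not spelled out here; a common "mixed" form combines channel attention with spatial attention, applied in sequence to a backbone feature map. The following is a minimal NumPy sketch under that CBAM-style assumption; all function names and shapes are illustrative, not the paper's implementation (in particular, the spatial branch replaces the usual learned convolution with a fixed blend to stay dependency-free):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, w1, w2):
    """feat: (C, H, W). A shared two-layer MLP is applied to the
    global average- and max-pooled descriptors; the results are
    summed and squashed into per-channel weights in (0, 1)."""
    avg = feat.mean(axis=(1, 2))                       # (C,)
    mx = feat.max(axis=(1, 2))                         # (C,)
    att = sigmoid(w2 @ np.maximum(w1 @ avg, 0.0)
                  + w2 @ np.maximum(w1 @ mx, 0.0))     # (C,)
    return feat * att[:, None, None]

def spatial_attention(feat, alpha=0.5):
    """Channel-wise mean and max maps blended and squashed into a
    per-pixel weight map. (A learned 7x7 conv would normally sit
    here; a fixed blend keeps the sketch self-contained.)"""
    avg_map = feat.mean(axis=0)                        # (H, W)
    max_map = feat.max(axis=0)                         # (H, W)
    att = sigmoid(alpha * avg_map + (1.0 - alpha) * max_map)
    return feat * att[None, :, :]

def mixed_attention(feat, w1, w2):
    """Channel attention followed by spatial attention."""
    return spatial_attention(channel_attention(feat, w1, w2))

# Demo on a random feature map with reduction ratio r = 4.
rng = np.random.default_rng(0)
C, H, W, r = 16, 8, 8, 4
feat = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1   # reduction layer
w2 = rng.standard_normal((C, C // r)) * 0.1   # expansion layer
out = mixed_attention(feat, w1, w2)
print(out.shape)  # (16, 8, 8)
```

Because both attention maps take values in (0, 1), the output is a re-weighted copy of the input feature map with the same shape, which is what lets such a block be dropped between ResNet-18 stages without changing the rest of the network.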
The detection results of some video frames of the straight and curved road scenes during the day are shown in Figure 10.
It can be seen from Figure 10 that our method accurately detects lane lines in daytime video of straight and curved lanes. Detection also succeeds despite the interference of vehicles and pedestrians in Figure 10c, and the method is robust to reflections from the front windshield. Lane lines can likewise be detected in the no-vision area under the vehicle hood.
The detection results of some frames of the night video scenes with lighting, reflection, and no lighting are shown in Figure 11.
It can be seen from Figure 11 that the method handles video with lighting, reflection, and no lighting at night without difficulty. Likewise, lane lines are successfully detected in no-vision areas such as those occluded by the vehicle hood, and even in the unlit curved scenes of Figure 11c,d.
The detection results of some frames of video with pavement breakage, pavement shadow, and vehicle occlusion are shown in Figure 12.
As shown in Figure 12, Figure 12a presents the pavement-breakage interference that vehicles commonly encounter while driving, Figure 12b presents pavement-shadow interference, and Figure 12c,d present vehicle-occlusion interference. Our method successfully detects lane lines in video sequences with all of the above interference.
Finally, partial detection results of the tunnel-scene video are shown in Figure 13.
As seen from Figure 13, the proposed method overcomes various disturbances and accurately detects lane lines, whether inside the tunnel or at the tunnel mouth, and whether affected by light reflection in the tunnel or by illumination at the tunnel mouth.
The above self-built video tests prove that the model and method in this paper can adapt to most of the lane line scenarios that drivers encounter on the road, and have good robustness and generality for lane line detection in video sequences.
In summary, the above results show that the proposed method is fast, accurate, and robust, with a good balance between accuracy and efficiency. However, the number of lane lines the method can detect has an upper limit, which must be set manually.
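The manually set upper limit on lane count corresponds to a fixed number of output "slots" in the detection head. A hypothetical sketch of how decoding such fixed-slot, row-anchor predictions might look; the constants (`MAX_LANES`, `NUM_ROWS`, `NUM_CELLS`) and the function are illustrative assumptions, not the paper's actual head:

```python
import numpy as np

MAX_LANES = 4    # manual upper limit on detectable lanes (assumption)
NUM_ROWS = 18    # row anchors sampled over the image height
NUM_CELLS = 100  # horizontal grid cells per row; index NUM_CELLS = "no lane"

def decode_lanes(logits, img_w=800, img_h=288):
    """logits: (MAX_LANES, NUM_ROWS, NUM_CELLS + 1).
    For each lane slot and row anchor, pick the most likely cell;
    the extra last class means 'no lane point in this row'."""
    lanes = []
    row_ys = np.linspace(0, img_h - 1, NUM_ROWS)
    for lane in logits:
        cells = lane.argmax(axis=1)                 # (NUM_ROWS,)
        pts = [(c * img_w / NUM_CELLS, y)
               for c, y in zip(cells, row_ys) if c < NUM_CELLS]
        if pts:
            lanes.append(pts)
    return lanes

# Demo: slot 0 holds a lane at grid cell 10 in every row; the
# remaining slots predict "no lane" everywhere.
logits = np.zeros((MAX_LANES, NUM_ROWS, NUM_CELLS + 1))
logits[0, :, 10] = 1.0
logits[1:, :, NUM_CELLS] = 1.0
lanes = decode_lanes(logits)
print(len(lanes), len(lanes[0]))  # 1 18
```

Under this formulation the limit is structural: a scene with more lanes than `MAX_LANES` cannot be fully represented, which is why the cap must be chosen manually for the target roads.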