To thoroughly evaluate MALNet's performance, segmentation experiments were carried out on the CDM_P, CDM_H, and CDM_C data sets, followed by a detailed analysis of both quantitative and qualitative outcomes.
4.2.1. Quantitative Results and Analysis
To quantitatively analyze the extraction results of the segmentation model, this study used the OA, P, R, F1, and IOU indicators to evaluate the test set results on CDM_P, CDM_H, and CDM_C. The F1 score and IOU jointly reflect the accuracy and completeness of the segmentation results; therefore, this study focuses on comparing these two indicators.
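These indicators follow the standard pixel-wise definitions. As a minimal sketch (assuming binary masks in which 1 marks damaged-marking pixels, and ignoring zero-division edge cases), they can be computed as:

```python
import numpy as np

def seg_metrics(pred, gt):
    """Pixel-wise OA, P, R, F1, and IOU for binary masks (1 = damaged marking)."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.sum(pred & gt)          # damaged pixels correctly detected
    fp = np.sum(pred & ~gt)         # background predicted as damaged
    fn = np.sum(~pred & gt)         # damaged pixels missed
    tn = np.sum(~pred & ~gt)        # background correctly rejected
    oa = (tp + tn) / pred.size
    p = tp / (tp + fp)
    r = tp / (tp + fn)
    f1 = 2 * p * r / (p + r)
    iou = tp / (tp + fp + fn)
    return oa, p, r, f1, iou
```

Note that, for a single confusion matrix, F1 and IOU are monotonically related (IOU = F1 / (2 - F1)), which is why the two indicators tend to rank models similarly.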
- (1) Results of the network performance test on the CDM_P data set
Table 1 presents the results of the network performance test on the CDM_P data set. The CDM_P data set serves as a publicly available resource for damaged road marking detection. Collected primarily during daylight hours, with a few instances during dusk, the data set benefits from favorable lighting conditions. The damaged road markings exhibit excellent contrast with their surroundings. Consequently, the performance of the 11 segmentation models on the CDM_P data set generally surpasses their results on the CDM_H and CDM_C data sets.
Among these models, MALNet demonstrates superior performance on the CDM_P data set, closely followed by the BiSeNet model. This observation underscores the effectiveness of both models in preserving local detail information while capturing global contextual cues. Specifically, they excel in identifying the boundaries of damaged road markings, resulting in more complete segmentation outcomes. MALNet, a lightweight damaged road marking segmentation network proposed in this study, leverages knowledge distillation. It incorporates multi-scale feature fusion and adaptive spatial attention mechanisms to enhance segmentation precision and completeness. By transferring knowledge from a teacher model to a student model through distillation strategies, MALNet achieves improved performance while minimizing computational resources. The BiSeNet model, with F1 and IOU scores reaching 83.99% and 72.39%, respectively, closely trails MALNet. Notably, BiSeNet operates as a bidirectional segmentation network, effectively utilizing spatial and contextual branches to retain spatial information and adapt to diverse scenarios and damage types.
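The specific distillation losses used by MALNet are not detailed in this section; the sketch below shows a common response-based formulation (temperature-softened KL divergence between per-pixel class distributions, in the style of Hinton et al.), purely as an illustration of how a teacher's predictions can supervise a student without changing the student's structure:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def distill_loss(student_logits, teacher_logits, T=4.0):
    """Temperature-softened KL divergence between teacher and student
    class distributions. Shapes: (num_pixels, num_classes). A higher
    temperature T exposes more of the teacher's 'dark knowledge'."""
    p_t = softmax(teacher_logits / T)
    p_s = softmax(student_logits / T)
    kl = np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12)), axis=-1)
    return T * T * kl.mean()   # T^2 rescaling keeps gradient magnitudes comparable
```

In training, this term would be added to the ordinary segmentation loss on the ground-truth masks.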
However, the performance of the remaining nine models on the CDM_P data set is relatively modest. EaNet exhibits a high precision (P) value but lower R and IOU values. This discrepancy suggests an overemphasis on positive samples during segmentation, leading to inaccuracies and incompleteness. ConvNeXt and MAResUNet achieve high F1 scores but low IOU values. These models correctly identify most marking regions (high recall) but lack precise boundary matching (low precision), resulting in suboptimal IOU. LANet, which focuses on local attention, notably underperforms compared to other models. This limitation may stem from the dense micro-features inherent in damaged road markings, where fine granularity matters. SegFormer records the lowest F1 and IOU values (68.90% and 52.56%, respectively). The absence of position encoding likely hampers its ability to effectively retain spatial information. Given that damaged markings vary in shape and size across different locations, accurate spatial context is crucial. Position encoding could enhance the model’s understanding of spatial layouts.
- (2) Results of the network performance test on the CDM_H data set
The results of the network performance test on the CDM_H data set are shown in Table 2. The CDM_H data set focuses on detecting damaged road markings specifically on highways. Although the types of damage are relatively uniform, the data were collected at night, and under the reduced lighting conditions the contrast between damaged road markings and their surroundings is less pronounced. Nevertheless, the data set comprises 3113 images, allowing models to learn from ample samples. Consequently, the segmentation performance achieved on the CDM_H data set closely approximates that achieved on the publicly available CDM_P data set, which contains 980 images.
Among the evaluated models, MALNet stands out on the CDM_H data set, achieving an F1 score of 83.75% and an IOU of 72.04%. Notably, these metrics significantly surpass those of other segmentation networks: MALNet outperforms the second-ranked LinkNet by 0.85 percentage points in F1 and 1.25 percentage points in IOU. The success of LinkNet, particularly on the CDM_H data set, can be attributed to its design as a road-specific segmentation network. Leveraging deconvolution and element-wise addition operations, LinkNet achieves feature upsampling and fusion. Interestingly, when considering the student models before knowledge distillation, MALNet-18 performs comparably to BiSeNet, LinkNet, EaNet, and MAResUNet. However, through knowledge distillation, MALNet experiences significant performance gains without altering its model structure. The distilled model outperforms other models in the same category. On the other hand, LANet exhibits the lowest F1 and IOU values (71.59% and 55.75%, respectively). This outcome may be attributed to the dense micro-features inherent in damaged road markings. While LANet enhances feature representation through local attention, it may lack the necessary finesse when handling such intricate details, resulting in compromised segmentation accuracy and completeness.
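LinkNet's decoder-encoder fusion can be illustrated as follows. This is a simplified sketch: the real network uses a learned deconvolution, for which a fixed nearest-neighbor upsampling stands in here, but the element-wise addition of the skip connection (rather than channel concatenation) is the defining ingredient:

```python
import numpy as np

def upsample2x(x):
    """2x spatial upsampling; stands in for LinkNet's learned deconvolution."""
    return x.repeat(2, axis=-2).repeat(2, axis=-1)

def linknet_decode(decoder_feat, encoder_feat):
    """LinkNet-style skip connection: upsample the decoder feature map and
    fuse it with the matching encoder feature by element-wise addition,
    which keeps the channel count (and parameter cost) low."""
    up = upsample2x(decoder_feat)
    assert up.shape == encoder_feat.shape
    return up + encoder_feat
```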
- (3) Results of the network performance test on the CDM_C data set
The network performance results on the CDM_C data set are presented in Table 3. The CDM_C data set serves as a comprehensive urban damaged road marking detection data set, encompassing various scenarios such as lane markings, intersections, main roads, and side roads. The diversity of damage types within this data set poses a unique challenge. However, due to nighttime data collection under insufficient illumination, the contrast between damaged road markings and their surroundings is less pronounced. Despite these complexities, the CDM_C data set comprises 1718 images, representing different geographical regions across China, including Chongqing, Wuhan, Shanghai, and Beijing.
Compared to the CDM_H highway data set and the publicly available CDM_P data set, the overall performance of the 11 evaluated models on the CDM_C data set is not particularly promising. Several factors contribute to this outcome.
- (1) Nighttime Data Collection: The predominantly nighttime data collection introduces low image quality and increased noise. Insufficient lighting diminishes the visibility of damaged road markings, making their differentiation from the surrounding environment challenging.
- (2) Spatial Heterogeneity: The diverse sampling locations (Chongqing, Wuhan, Shanghai) introduce spatial heterogeneity. Variations in road materials, colors, and humidity across different regions impact spectral differences, potentially affecting model generalization.
- (3) Variety of Road Types: The CDM_C data set covers a wide range of road types, including urban main roads, side streets, and intersections. The multitude of damaged road marking types presents a complex scenario. However, the data set’s sample size relative to the diversity of damage types remains insufficient, limiting the model’s robustness.
Despite these challenges, MALNet consistently achieves optimal results on the CDM_C data set. This underscores MALNet’s ability to leverage multi-scale feature fusion and adaptive spatial attention mechanisms, effectively addressing segmentation complexities in intricate urban road scenes. Its robustness and generalization capabilities remain noteworthy.
- 2. Overall Analysis of Quantitative Experimental Results
The experimental results underscore the superiority of the MALNet series models across all three data sets. These models consistently maintain excellent performance even in the face of diverse damage types and complex backgrounds, demonstrating their robustness and generalization capabilities. Notably, MALNet outperforms the original student model, MALNet-18, across all evaluation metrics, particularly achieving significant improvements in the F1 score and IOU. This enhancement highlights its accuracy and resilience.
From a structural perspective, the MALNet series models consistently achieve optimal performance across all tested data sets, affirming their robust stability and broad applicability. Notably, on the publicly available CDM_P data set, MALNet-101 achieves the best results, with an overall accuracy (OA) of 99.59% and an IOU of 74.54%. This result demonstrates its exceptional ability to accurately identify correct samples and differentiate between various types of damaged road markings.
Considering the data sets, the CDM_P data set primarily comprises images captured during daylight hours, benefiting from favorable lighting conditions that create an ideal testing environment. Although the CDM_H data set predominantly consists of nighttime images with limited visibility, its substantial sample size still yields results comparable to those of the CDM_P data set, emphasizing the importance of ample samples for effective model learning and adaptation. Meanwhile, the CDM_C data set presents the most challenging conditions, including diverse urban road scenes and insufficient nighttime illumination. Nevertheless, MALNet continues to achieve outstanding performance, showcasing its adaptability in complex scenarios.
Regarding knowledge distillation, MALNet effectively leverages the knowledge from the teacher model, resulting in significant performance gains without altering the model’s structure. This technique enhances the student model’s performance while keeping the model lightweight, making it suitable for deployment in resource-constrained environments.
In summary, the MALNet series models stand out due to their balanced performance across various data sets, affirming their effectiveness and reliability in detecting damaged road markings across diverse road scenarios. Their accuracy, completeness, and robustness make them the preferred choice for semantic segmentation tasks under diverse conditions.
4.2.2. Qualitative Results and Analysis
To comprehensively showcase the performance of the MALNet model, we specifically examine three data sets: CDM_P, CDM_H, and CDM_C. By maintaining consistent loss functions and learning rates, we compare MALNet against other methods. Our analysis aims to highlight the advantages of MALNet in handling challenging features present in damaged road marking images.
- (1) Results of the network performance test on the CDM_P data set
Figure 7 illustrates the segmentation results of the CDM_P data set, showcasing commendable performance across all 11 models. This achievement likely stems from the data set’s inherent characteristics. CDM_P was meticulously collected during daylight hours, ensuring favorable illumination conditions. Consequently, the damaged road markings exhibit pronounced contrast with the surrounding environment, minimizing interference and facilitating accurate segmentation.
A comparative analysis of the segmentation outcomes reveals that MALNet consistently stands out. Its segmentation results exhibit clear boundaries and high completeness, surpassing other models. Notably, MALNet incorporates an innovative design—the multi-scale dynamic selection convolution module. This novel approach automatically adapts sub-convolution kernels based on input feature content and task demands, dynamically adjusting the receptive field. Remarkably, MALNet achieves broader decoding context information without introducing additional parameters, effectively enhancing both segmentation accuracy and completeness.
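The internal structure of the multi-scale dynamic selection convolution module is not reproduced in this section; the sketch below illustrates the general selective-kernel idea it describes, with the learned projection layers omitted for brevity: outputs of sub-kernels with different receptive fields are fused by content-dependent softmax weights, so the effective receptive field adapts to the input.

```python
import numpy as np

def softmax(x, axis=0):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def dynamic_select(branches):
    """branches: (K, C, H, W) feature maps from K sub-kernels of different
    sizes. Global average pooling yields a per-branch, per-channel
    descriptor; a softmax across the K branches produces selection
    weights, steering each channel toward the receptive field whose
    response dominates for this input. (A real module would pass the
    descriptor through learned FC layers first.)"""
    desc = branches.mean(axis=(-2, -1))                        # (K, C)
    weights = softmax(desc, axis=0)                            # softmax over branches
    return (weights[..., None, None] * branches).sum(axis=0)   # (C, H, W)
```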
- (2) Results of the network performance test on the CDM_H data set
Figure 8 showcases the segmentation results of the CDM_H data set. Notably, the depicted road markings represent typical examples of blurred markings—where the overall contour remains discernible, but the details have significantly deteriorated due to wear and tear.
Remarkably, MALNet’s segmentation results closely approach those of the teacher model, MALNet-101, successfully extracting clear and complete damaged road marking contours. This achievement can be attributed to MALNet’s innovative use of the multi-scale dynamic selection convolution module. This novel design allows MALNet to automatically adapt sub-convolution kernels based on input feature content and task requirements, dynamically adjusting the receptive field. Consequently, MALNet effectively captures a broader decoding context without introducing additional parameters, thereby enhancing segmentation accuracy and completeness.
However, BiSeNet’s segmentation results, while relatively complete, exhibit an error in the lower right corner, misclassifying non-road marking areas as damaged road markings. This discrepancy may arise from BiSeNet’s underutilization of spatial information, leading to insufficient differentiation between road markings and the background. Although BiSeNet excels in rapid inference speed, further improvements are necessary to handle complex scenarios involving blurred road markings.
Additionally, LinkNet and EaNet achieve partial segmentation in limited regions, resulting in missing segments. These models demonstrate limitations when dealing with blurred road markings. While LinkNet prioritizes lightweight and efficient design, its performance in complex scenarios requires enhancement. EaNet, on the other hand, should better leverage spatial information to improve segmentation outcomes.
Furthermore, SegFormer, MAResUNet, and LANet only partially segment the damaged road markings, failing to capture their complete details. SegFormer, characterized by its use of Transformer structures, has room for improvement in detail representation. As for MAResUNet and LANet, their designs need to better address complex damage scenarios.
- (3) Results of the network performance test on the CDM_C data set
Figure 9 illustrates typical examples of urban road marking wear. Unlike highways, urban road conditions vary significantly due to inadequate maintenance and frequent accidents. Consequently, the overall wear on urban roads is more severe than that observed on highways. In the example, the road markings have almost worn away, leaving only faint contours visible.
Remarkably, the MALNet series models continue to demonstrate outstanding segmentation performance, successfully extracting relatively complete contours of the damaged road markings. MALNet’s success can be attributed to its innovative use of the adaptive spatial and channel attention modules. These modules enhance the feature expression capacity of the segmentation branches, enabling the model to better capture spatial information related to road damage and markings. Additionally, MALNet’s adaptive fusion module dynamically adjusts the output weights of the segmentation branches, automatically selecting more suitable fusion strategies based on input images. This adaptability significantly improves segmentation robustness.
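As an illustration of these ideas (not the exact MALNet modules, whose learned convolution and FC layers are omitted here), channel attention, spatial attention, and adaptive branch fusion can be sketched as:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat):
    """Gate each channel by its global average response (feat: (C, H, W))."""
    gate = sigmoid(feat.mean(axis=(-2, -1)))       # (C,)
    return feat * gate[:, None, None]

def spatial_attention(feat):
    """Gate each pixel by the channel-averaged response at that location."""
    gate = sigmoid(feat.mean(axis=0))              # (H, W)
    return feat * gate[None, :, :]

def adaptive_fuse(branch_a, branch_b):
    """Adaptive fusion: weight the two segmentation branches by a softmax
    over their pooled activations, so the fusion strategy follows the input."""
    scores = np.array([branch_a.mean(), branch_b.mean()])
    e = np.exp(scores - scores.max())
    w = e / e.sum()
    return w[0] * branch_a + w[1] * branch_b
```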
However, BiSeNet’s segmentation results, while relatively complete, exhibit an error in the lower right corner, misclassifying non-road marking areas as damaged road markings. This discrepancy may arise from BiSeNet’s underutilization of spatial information when dealing with geometrically complex and irregularly edged road markings, leading to insufficient differentiation between damaged markings and complex backgrounds.

Moreover, LinkNet, LANet, ConvNeXt, and SegFormer achieve segmentation only in limited regions of the urban damaged road markings, resulting in missing segments. These models demonstrate limitations when handling highly worn and blurred markings. While LinkNet emphasizes lightweight and efficient design, its performance in complex scenarios requires improvement. EaNet, on the other hand, achieves relatively complete segmentation results, benefiting from its external attention mechanism. This mechanism utilizes two small, learnable, shared-memory units to effectively extract image features. Notably, this approach achieves spatially effective image segmentation, particularly when dealing with severely worn road markings.
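The external attention mechanism referred to above replaces self-attention’s quadratic pixel-to-pixel interaction with two small memory units shared across the whole data set. A sketch follows (using one common double-normalization variant; in practice the memory units are learned parameters):

```python
import numpy as np

def external_attention(feats, Mk, Mv):
    """External attention: pixels attend to small shared memory units
    instead of to each other.
    feats: (N, d) flattened pixel features; Mk, Mv: (S, d) key/value
    memory units with S << N, so the cost is linear in N."""
    attn = feats @ Mk.T                                # (N, S) similarity to memory keys
    attn = np.exp(attn - attn.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)            # softmax over memory slots
    attn /= attn.sum(axis=0, keepdims=True) + 1e-9     # double normalization over pixels
    return attn @ Mv                                   # (N, d) re-expressed features
```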
- 2. Overall Analysis of Qualitative Experimental Results
The experimental results demonstrate that the MALNet series models consistently outperform other methods across all three data sets. This performance underscores their effectiveness in handling diverse damage types and complex backgrounds. MALNet excels in segmenting all types of damaged road markings, exhibiting superior accuracy and completeness compared to other models. These findings highlight MALNet’s robustness and generalization capabilities, enabling it to adapt to different damage types and lighting conditions.
From a structural perspective, the MALNet series models dynamically adjust the receptive field using the multi-scale dynamic selection convolution module. This innovation enhances the precision and completeness of the segmentation branches. Additionally, the adaptive spatial and channel attention modules augment feature expression, allowing MALNet to better capture spatial information related to road damage and markings. Furthermore, the model dynamically adjusts the output weights of the segmentation branches, automatically selecting fusion strategies suitable for different input images. This adaptability significantly improves the overall robustness of the segmentation branches.
Regarding knowledge distillation, MALNet effectively learns from the teacher model, acquiring additional detail expression capacity through multi-level distillation. This process enhances the performance of the student model’s segmentation branches, achieving results comparable to those of the teacher model. Remarkably, MALNet achieves this with fewer parameters.
In summary, the MALNet series models stand out due to their balanced performance across various data sets, affirming their effectiveness and reliability in detecting damaged road markings across diverse road scenarios. Compared to the original student model, MALNet demonstrates improved accuracy, completeness, and robustness, making it a powerful segmentation model suitable for a wide range of practical applications.