Article

Enhancing Autonomous Vehicle Perception in Adverse Weather: A Multi Objectives Model for Integrated Weather Classification and Object Detection

Department of Computer Science, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah 21589, Saudi Arabia
*
Author to whom correspondence should be addressed.
Electronics 2024, 13(15), 3063; https://doi.org/10.3390/electronics13153063
Submission received: 14 May 2024 / Revised: 22 July 2024 / Accepted: 29 July 2024 / Published: 2 August 2024
(This article belongs to the Special Issue Advances in the System of Higher-Dimension-Valued Neural Networks)

Abstract

Robust object detection and weather classification are essential for the safe operation of autonomous vehicles (AVs) in adverse weather conditions. While existing research often treats these tasks separately, this paper proposes a novel multi objectives model that treats weather classification and object detection as a single problem using only the AV camera sensing system. Our model offers enhanced efficiency and potential performance gains by integrating image quality assessment, a Super-Resolution Generative Adversarial Network (SRGAN), and a modified version of You Only Look Once (YOLO) version 5. Additionally, leveraging the challenging Detection in Adverse Weather Nature (DAWN) dataset, which covers four types of severe weather, including the often-overlooked sandy conditions, we applied several augmentation techniques, expanding the dataset from 1027 to 2046 images. Furthermore, we optimize the YOLO architecture for robust detection of six object classes (car, cyclist, pedestrian, motorcycle, bus, truck) across adverse weather scenarios. Comprehensive experiments demonstrate the effectiveness of our approach, achieving a mean average precision (mAP) of 74.6%, underscoring the potential of this multi objectives model to significantly advance the perception capabilities of autonomous vehicles’ cameras in challenging environments.

1. Introduction

The rapid advancement of autonomous vehicle (AV) technology has captured the attention of researchers, engineers, policymakers, and the public. Central to AV development are sensors that enable perception and decision-making within dynamic driving environments. Among these, camera sensors play a vital role as the primary source of visual perception in AV systems. Cameras capture real-time high-resolution images of the vehicle’s surroundings, providing crucial visual data for the accurate detection and classification of various objects. By leveraging advanced object detection algorithms, cameras contribute to various AV functionalities such as lane keeping and path planning by continuously monitoring lane markings and changes in road layout. This enables the vehicle to maintain its position within lanes and make informed decisions regarding trajectory and maneuvering, thereby enhancing overall road safety and traffic flow. Furthermore, camera sensors contribute to path planning by identifying obstacles, traffic signs, and other entities, enabling the vehicle to adapt its trajectory accordingly and navigate complex traffic scenarios.
Depth estimation is another key capability of camera sensors, enabling AVs to perceive the distances of surrounding objects accurately. Through advanced image processing techniques, cameras can provide depth perception, enhancing the vehicle’s spatial awareness and obstacle avoidance capabilities. Image segmentation is an additional task performed by camera sensors, wherein the visual scene is segmented into semantically meaningful regions. This segmentation enables the AV to distinguish between various elements within its field of view, facilitating robust object detection and classification, essential for safe navigation. Moreover, camera sensors play a crucial role in the fusion of perception, integrating data from multiple cameras positioned around the vehicle to construct a comprehensive situational awareness map. This fusion enhances the vehicle’s understanding of its surroundings, enabling it to make informed decisions in real time. In addition to external perception, camera sensors also contribute to cabin monitoring and provide essential data for passenger behaviors and status. Figure 1 lists the main roles of cameras and their systems in AVs.
In addition to their primary functions, cameras offer a cost-effective and lightweight solution compared to alternative sensor technologies such as LiDAR and radar. This affordability facilitates widespread adoption and deployment of AV technology, paving the way for a future where autonomous vehicles are ubiquitous on our roads.
Despite these capabilities, reaching the highest level of AV development, “full automation” [1], requires vehicles to detect every object in their surrounding environment under all conditions and scenarios with at least human-level reliability. Achieving this level remains very challenging in adverse weather, which significantly degrades the ability of camera sensors to capture clear and reliable images of the environment. Weather conditions such as rain, snow, fog, and sandstorms pose significant challenges for camera sensors, and the key challenges include:
  • Reduced visibility: adverse weather conditions often lead to reduced visibility, impairing the effectiveness of camera sensors in capturing clear images of the environment. Rain, snow, and fog can obscure the field of view, making it challenging for cameras to discern objects and obstacles accurately.
  • Water droplets and snow accumulation: rain and snow can result in water droplets or snow accumulation on camera lenses, leading to distortion, blurring, or occlusion of captured images. This accumulation can degrade image quality and hinder object detection and recognition capabilities.
  • Fog and haze: foggy conditions create a hazy atmosphere that reduces contrast and clarity in camera images, hindering object detection and localization. The presence of fog and haze makes it difficult for cameras to distinguish objects from their background, compromising the reliability of AV perception systems.
  • Glare and reflections: glare from wet road surfaces or reflective surfaces can cause reflections in camera images, resulting in overexposed or washed-out images. Glare and reflections can obscure important visual information, making it challenging for AVs to navigate safely in adverse weather conditions.
  • Sand and dust particles: sandstorms and dusty conditions can lead to the accumulation of sand and dust particles on camera lenses, obstructing the field of view and degrading image quality. This accumulation of particles can compromise the performance of camera sensors, affecting the reliability of AV perception systems.
  • Dynamic lighting conditions: adverse weather conditions can cause rapid changes in lighting conditions, including variations in brightness, contrast, and color temperature. Camera systems must adapt to these dynamic lighting conditions to maintain accurate perception of the environment and ensure reliable object detection and recognition.
  • Sensor calibration: adverse weather conditions may necessitate adjustments to camera calibration parameters to compensate for changes in lighting, visibility, and sensor performance. Ensuring accurate sensor calibration is essential for maintaining the reliability and effectiveness of camera sensors in adverse weather conditions.
  • Reliability and robustness: adverse weather conditions pose reliability and robustness challenges for camera sensors, requiring them to continue functioning effectively in harsh environmental conditions. Ensuring the durability and resilience of camera sensors is crucial for the safe and reliable operation of AVs in adverse weather.
Figure 2 shows some object detection challenges in adverse weather. Each type of weather has its own obstacles. For instance, in heavy snow and sandstorms, the road boundaries can be obscured by snow and sand. During rain, water droplets on camera lenses lead to distortion and blurring of captured images. All of the abovementioned challenges make it difficult for the camera to accurately perceive objects.
When exploring object detection using convolutional neural networks (CNNs), we encounter two primary approaches: one-stage and two-stage. The two-stage approach, pioneered by the introduction of the Region-Based CNN (R-CNN) model in 2014, involves a region proposal stage to identify regions containing objects, followed by feature extraction and object classification [2]. However, this method was slow due to processing each proposed region separately. Fast R-CNN, introduced the following year, improved the speed by passing the entire image through the CNN, generating a feature map for object detection [3]. Faster R-CNN [4] was then introduced to further enhance the performance. A significant advancement occurred in 2017 with the introduction of Mask R-CNN [5]. Mask R-CNN adopts the Feature Pyramid Network (FPN) as its backbone [6] and introduces a novel phase to the detection process by generating a segmentation mask for each object.
On the other hand, the one-stage approach, first demonstrated by Redmon et al. [7] with the YOLO model, encapsulates the entire detection process in a single pass through the CNN. YOLOv2 and YOLOv3 [8] refined the architecture, with YOLOv3 introducing the Darknet-53 backbone. YOLOv4 was then proposed with the aim of improving both accuracy and speed, and it achieved notable improvements over its predecessors. YOLOv5, YOLOv7 [9], and YOLOv8 were later iterations of YOLO. Another model, the Single-Shot MultiBox Detector (SSD), proposed in [10], achieved competitive results on the VOC2007 dataset, with improvements in mAP.
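For readers unfamiliar with the one-stage workflow, the minimal sketch below loads a pretrained YOLOv5 model through PyTorch Hub and runs single-pass detection. The chosen weights, image path, and confidence threshold are illustrative assumptions only, not the modified architecture proposed later in this paper.

```python
# Minimal one-stage detection sketch using YOLOv5 via PyTorch Hub.
# Illustrative only; not the modified model proposed in this paper.
import torch

# Load a small pretrained YOLOv5 checkpoint from the Ultralytics repository.
model = torch.hub.load("ultralytics/yolov5", "yolov5s", pretrained=True)
model.conf = 0.4  # confidence threshold (assumed value)

# Run detection in a single forward pass on an image file (placeholder path).
results = model("foggy_road.jpg")
results.print()              # prints class, confidence, and bounding box
boxes = results.xyxy[0]      # tensor rows: [x1, y1, x2, y2, confidence, class]
```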
In this paper, our aim is to propose a solution for AVs based on camera sensors that can not only detect objects, but also classify weather based on the condition of the scene. The scope of our paper is shown in Figure 3.
The main contributions of our work are as follows:
  • We propose a multi objectives model for classifying weather and detecting objects. As we will demonstrate in Section 2, and to the best of our knowledge, existing AV papers treat classifying weather and detecting objects as separate problems. Our proposed model treats road weather classification and road object detection as a unified problem for camera sensing systems.
  • We have expanded the Detection in Adverse Weather Nature (DAWN) dataset by adding augmented images that cover all four types of weather conditions (sandy, rainy, foggy, and snowy). The total dataset size has nearly doubled, increasing from its original size.
  • In addition to providing a single model for classifying weather and detecting objects, we address a critical gap in autonomous vehicle research by considering sandy weather conditions, which have been largely overlooked by existing studies.
  • The base architecture of the object detection model You Only Look Once (YOLO) version 5 has been adapted and modified to suit our domain. As a result, we have successfully increased the mean average precision (mAP) to 74.6%, which is a promising result compared to other papers that used the same dataset.

2. Related Work

Detecting objects in challenging weather conditions presents difficulties because the quality of images degrades and visual features are compromised due to weather phenomena like rain, fog, snow, and sandstorms. These conditions impact detection performance by diminishing scene lighting, reducing object visibility, and complicating object differentiation from surrounding elements. Several papers have been published aiming to propose suitable solutions.
In [11], the authors studied adverse weather classification along with light level in the AV environment. To tackle the issue of perception under adverse weather and low light conditions, where accuracy degradation is a significant concern, the authors introduced their own dataset. The dataset was designed to cover three types of weather (fog, rain, and snow) and three levels of lighting (bright, moderate, and low), along with three street types (asphalt, grass, and cobblestone). The processed images of the dataset contain three labels related to the type of weather, lighting level, and street type. The authors used ResNet18 as their backbone and concluded that the system performed with low accuracy on the dataset and needed further enhancement.
In [12], the authors address the challenges of AVs during adverse weather conditions, where typical perceptual models struggle. Existing research mainly focuses on classifying weather conditions; however, the authors studied the transitions between these types of weather. They proposed a method to define and understand six intermediate weather transition states (cloudy to rainy, rainy to cloudy, sunny to rainy, rainy to sunny, sunny to foggy, and foggy to sunny). The approach involves interpolating intermediate weather transition data using a variational autoencoder, extracting spatial features with VGG (Visual Geometry Group) very deep convolutional networks, and modeling the temporal distribution with a gated recurrent unit for classification. The authors proposed a new large-scale dataset called AIWD6 (Adverse Intermediate Weather Driving), and the results showed an effective weather transition model.
In [13], the authors introduce a novel framework called WeatherNet, which employs four deep CNN models based on the ResNet50 architecture. WeatherNet autonomously extracts weather information from the input image and classifies it into the correct category. However, the drawback of the presented framework is its inability to share features, since the four models work separately.
Ref. [14] focuses on the significant impact of adverse weather conditions on urban traffic and highlights the importance of weather condition recognition for applications such as AV assistance and intelligent transportation systems. Leveraging advancements in deep learning, the paper introduces a new simplified model called ResNet15, a reduced version of the well-known ResNet50 [15]. The proposed model uses a fully connected layer with a Softmax classifier. The paper also introduces a new dataset called “WeatherDataset-4” containing about 5000 images covering foggy, rainy, snowy, and sunny weather. Although the proposed network outperformed the traditional ResNet50, the paper lacks coverage of nighttime and sandy environments.
In [16], the authors proposed the MCS-YOLO algorithm to enhance object detection by integrating a coordinate attention mechanism, a multiscale structure for small objects, and applying the Swin Transformer structure [17]. Through experiments on the BDD100K dataset, they demonstrated a mean average precision (mAP) of 53.6%.
Paper [18] is one of the earliest papers that applied CNNs to AV weather classification. The authors added two fully connected layers to extract features from Road Surface Condition (RSC) images. The paper focused on winter road conditions, where the problem of snowy roads was divided into three experiments: (a) two-class classification, (b) three-class classification, and (c) five-class classification. The model surpassed traditional classification techniques and recorded an accuracy of 78.5% for five-class classification. In [19], YOLOv4 was enhanced with an anchor-free, decoupled head for object detection. The authors used BDD100K as the base dataset and created a new version that focuses on three types of weather (rain, snow, and fog). The experimental results showed a mAP of 60.3%.
In [20], the authors extracted high-precision motion data and proposed a new vehicle tracking mechanism called SORT++. Image-Adaptive YOLO (IA-YOLO) was presented in [21] and showed an improvement in detecting objects in low light and foggy environments.
Ref. [22] proposed the Dual Subnet Network (DSNet) for detecting objects and achieved a mAP of 50.8% in foggy weather. In [23], YOLOv5 was investigated for detecting objects of several classes, and the mAP across all classes was 25.8%. In [24], drone images were collected and applied to a modified version of YOLOv5, which scored a mAP of about 50%. Paper [25] compared the performance of YOLOv3, YOLOv4, and Faster R-CNN in different types of weather (rainy, foggy, and snowy) and concluded that YOLOv4 outperformed both YOLOv3 and Faster R-CNN.
Table 1 shows a summary of recent publications on weather classification and object detection in the AV environment. While standard object detection models focus solely on the detection process, our proposed model introduces several key differences compared to recent related studies. First, we have incorporated a new phase in our model called the “Quality Block,” designed to assess and enhance the observed scene. Second, we have added an adjustable threshold score to reduce the number of images entering the enhancement phase. Third, our study uniquely addresses sandy weather conditions, which have not been considered in recent publications.

3. Methodology

Our methodology for developing a model capable of both weather classification and object detection in severe weather started with the Detection in Adverse Weather Nature (DAWN) dataset [26]. We focused on covering four key weather types (sandy, rainy, foggy, and snowy) with six object classes (pedestrian, cyclist, car, motorcycle, bus, and truck). To expand the dataset and introduce new variations of the existing images, we applied data augmentation. The augmented images were combined with the original DAWN dataset to increase the number of training samples; a full description of the augmentation is provided in Section 6. We then partitioned the combined dataset, using 80% of the images for training and the remaining 20% for validation and testing (10% for validation and 10% for testing). The training set was used to train both the weather classification and object detection models, while the validation set served the critical role of preventing overfitting. We then performed optimization steps, tuning the hyperparameters to find the best model performance. Lastly, we evaluated the optimized models using the standard mean average precision (mAP), precision, and recall metrics. Figure 4 shows the sequence of our methodology.
To optimize computational efficiency given limited GPU resources, we employed Google’s cloud-based Colab platform as our experimental environment. Colab provides the PyTorch machine learning framework and high-performance GPUs (such as the Tesla T4). Through Colab, we were able to execute our experiments effectively, in particular through its integration with CUDA (Compute Unified Device Architecture), which accelerates the computational workload of our pipeline and the CNN-based detection stages (specifically the convolution, pooling, normalization, and activation layers).
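As a brief illustration of this setup, the sketch below selects the CUDA device when available and partitions an image list into the 80/10/10 split described above. The directory name and random seed are assumptions for illustration, not the exact scripts used in our experiments.

```python
# Sketch of the experimental setup: GPU selection and an 80/10/10 split.
# Directory name and random seed are illustrative assumptions.
import random
from pathlib import Path
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Running on: {device}")  # e.g., a Tesla T4 on Colab

images = sorted(Path("dawn_augmented/images").glob("*.jpg"))
random.seed(0)
random.shuffle(images)

n = len(images)
train = images[: int(0.8 * n)]                # 80% training
val   = images[int(0.8 * n): int(0.9 * n)]    # 10% validation
test  = images[int(0.9 * n):]                 # 10% testing
print(len(train), len(val), len(test))
```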
Various metrics are available for quantifying the efficacy of object detection models. In our paper, we prioritized three main metrics: (a) mean average precision (mAP), (b) precision, and (c) recall. mAP stands as a prevalent evaluation metric within the domain of object detection, offering a holistic assessment of the model’s proficiency in object identification and localization. mAP combines precision and recall by computing the average precision (AP) for each object class or category, subsequently deriving the mean across all classes. AP serves as a measure of the detection’s quality, encapsulating both the precision of accurately identified objects and the completeness of detection in the scene. Through computation of mAP, our pipeline performance can be numerically compared and evaluated across diverse domains and scenarios.
We also considered precision and recall as indispensable metrics in the context of object detection. Precision is the proportion of detections that belong to the correct class, while recall measures the percentage of relevant objects that are successfully retrieved. Precision is expressed as the ratio of true positives (TPs) to the sum of true positives and false positives (FPs), represented as:
Precision = TP/(TP + FP)
Recall is the ratio of TP to the sum of true positives and false negatives (FNs), represented as:
Recall = TP/(TP + FN)
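To make these definitions concrete, the toy computation below evaluates precision, recall, and the derived F1 score from detection counts; the counts are made-up numbers for illustration, not results from this paper.

```python
# Toy computation of precision and recall from detection counts.
# The counts below are hypothetical, not results reported in this paper.
def precision(tp: int, fp: int) -> float:
    return tp / (tp + fp) if (tp + fp) else 0.0

def recall(tp: int, fn: int) -> float:
    return tp / (tp + fn) if (tp + fn) else 0.0

tp, fp, fn = 80, 14, 37          # hypothetical counts for one class
p, r = precision(tp, fp), recall(tp, fn)
f1 = 2 * p * r / (p + r)         # F1, discussed later with Figure 10
print(f"precision={p:.2f}, recall={r:.2f}, F1={f1:.2f}")
# mAP then averages the per-class average precision (the area under the
# precision-recall curve) over all six object classes.
```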

4. Dataset

For the dataset, as we mentioned earlier, we used DAWN in our development and experimentation. The DAWN dataset covers four types of adverse weather: sandstorm, rain, snow, and fog. Figure 5 shows a sample of the various weather types covered in DAWN. The dataset contains 1027 images covering the four types of weather and different environmental contexts such as highways, freeways, and urban landscapes, ensuring a broad representation of real-world scenarios.
Although many other datasets cover adverse weather conditions, the DAWN dataset has the advantage of including sandstorm or sandy weather images, which are often absent from other datasets. This unique feature of DAWN has allowed our model to address weather classification and object detection across multiple types of geographical environments. The DAWN dataset originally consists of 1027 images, each with a size of 640 × 640 pixels.
Each image annotation contains the object class and the corresponding bounding box coordinates (x_center, y_center, width, height). Figure 6 shows a sample of our labeled images, used as the ground-truth reference.
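For clarity, the snippet below parses one annotation file in this format, assuming YOLO-style text labels with normalized coordinates; the class ordering and file path are assumptions for illustration.

```python
# Parse a YOLO-format label file: each line is
# "<class_id> <x_center> <y_center> <width> <height>" (values in [0, 1]).
CLASSES = ["car", "cyclist", "pedestrian", "motorcycle", "bus", "truck"]  # assumed order

def read_labels(path: str):
    boxes = []
    with open(path) as f:
        for line in f:
            if not line.strip():
                continue  # skip blank lines
            cls_id, xc, yc, w, h = line.split()
            boxes.append((CLASSES[int(cls_id)],
                          float(xc), float(yc), float(w), float(h)))
    return boxes

# Example (hypothetical path): read_labels("dawn_augmented/labels/sand_0001.txt")
```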

5. Proposed Model

Our proposed pipeline establishes comprehensive weather classification and object detection by integrating four main tasks: (1) image quality assessment, (2) image enhancement, (3) weather classification, and (4) object detection. We have combined tasks 1 and 2 into one block called the “Quality Block,” while tasks 3 and 4 are combined in another block called the “Classify and Detect Block.” Once an image enters the model, an assessment is made to evaluate its quality. The purpose of this step is to decide whether the captured image needs improvement. The Blind/Referenceless Image Spatial Quality Evaluator (BRISQUE) method [27] has been employed to quantify image quality. If the image has a score higher than a threshold point (low quality), it is rejected and transferred to an enhancement stage; otherwise, it is approved and transferred directly to the classification and detection block. The threshold point can be changed and modified based on the scene situation; for instance, in our experiments we used a threshold of 42.75, as we explain in Section 7 “Experiments and Results.” It is worth noting that, in the BRISQUE method, a lower score generally indicates better perceptual quality, while a higher score indicates worse perceptual quality.
The BRISQUE algorithm has several advantages that make it a suitable solution for our model and for evaluating adverse weather scenes. Firstly, it is a no-reference image quality metric, in that it does not require a perfect reference image for comparison. This is highly advantageous in adverse weather conditions, where obtaining ideal, undistorted images can be challenging, if not impossible. BRISQUE functions by analyzing the natural scene statistics (NSS) of an image and comparing them to the expected statistics of natural (undistorted) images. Any deviations from this naturalness are flagged as indicators of quality degradation, making it a good fit for detecting the kinds of distortions introduced by weather phenomena. Moreover, BRISQUE offers computational efficiency compared to several other options, which is important when working with large image datasets or in scenarios where real-time quality assessment is desired.
For the image enhancement phase, we used the Super-Resolution Generative Adversarial Network (SRGAN) technique [28], which consists of generator and discriminator networks. The generator network upscales low-resolution images, while the discriminator network drives the generator to refine its output, resulting in improved image clarity.
Following the Quality Block, the image is processed by two YOLOv5 networks. One YOLO network, trained extensively on a dataset of weather-labeled images, classifies the weather condition as sandy, rainy, snowy, or foggy. Simultaneously, a separate YOLO network, trained to identify and localize objects with bounding boxes, detects the targeted objects, such as cars, cyclists, pedestrians, motorcycles, buses, and trucks. Overall, our proposed model offers a two-pronged approach, prioritizing image quality before seamlessly transitioning to robust YOLO-based weather classification and object detection for reliable image analysis. Figure 7 shows an illustration of our proposal.
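To make the control flow of the two blocks concrete, the sketch below wires the Quality Block and the Classify and Detect Block together. The helpers (brisque_score, srgan_enhance, and the two YOLO callables) are placeholders of our own naming, standing in for a BRISQUE implementation, an SRGAN generator, and the two trained YOLOv5 networks; only the decision logic mirrors the model described above.

```python
# Conceptual sketch of the proposed pipeline. The passed-in helpers are
# placeholders, not a published API; only the control flow is depicted.
BRISQUE_THRESHOLD = 42.75   # dataset-average score, see Section 7

def perceive(image, brisque_score, srgan_enhance, weather_yolo, object_yolo):
    # --- Quality Block ----------------------------------------------------
    score = brisque_score(image)          # lower score = better quality
    if score > BRISQUE_THRESHOLD:         # low-quality frame is rejected...
        image = srgan_enhance(image)      # ...and enhanced with SRGAN
    # --- Classify and Detect Block ----------------------------------------
    weather = weather_yolo(image)         # sandy / rainy / snowy / foggy
    objects = object_yolo(image)          # bounding boxes for the 6 classes
    return weather, objects
```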

6. Augmented DAWN

Given the limited number of adverse weather images in the DAWN dataset, we built a new version of DAWN using augmentation to scale up our experimental dataset. Data augmentation is a widely used method to artificially enlarge datasets by creating new training images from a currently available dataset. Various papers, such as [29,30], and the incremental improvement of YOLOv3 [8] have employed data augmentation for their weather classification or object detection datasets. Our augmented version increases the DAWN dataset from 1027 images to 2046 images, almost double its original size. Figure 8 shows a general overview of the DAWN dataset before and after our applied augmentation.
Augmentation techniques encompass a range of image adjustments, such as image scaling, rotation, cropping, flipping, color adjustment, noise, or blur, among many other operations. Applying augmentation for weather classification and object detection can be very advantageous for the following reasons:
  • Enhancing the diversity and variability of training data, aiding the model’s generalization to unrepresented scenarios.
  • Boosting the model’s resilience against factors affecting object appearance, such as varying lighting conditions, occlusions, or viewpoint changes.
  • Addressing class imbalance by oversampling minority classes or undersampling majority ones.
  • Mitigating overfitting by introducing regularization and noise into the training data.
The following sections describe the augmentations we performed in this paper.

6.1. Rotation

This is used to present the object from different angles of view. In real-world scenarios, objects might appear at different angles or rotations, and adding this to our augmentation can help the model better handle these view variations.

6.2. Hue

Hue is a color-based image augmentation technique that alters the hue or color tone of an image while preserving its brightness and saturation.

6.3. Noise

We have also incorporated synthetic noise into our augmentation process to expand our dataset. This type of augmentation enhances our model’s resilience to noise and enhances its capability to adapt to new data or scenarios.

6.4. Saturation

Saturation adjusts the intensity of colors within an image. By saturating an image, we effectively scale the pixel values by a random factor within a specified range. Increasing the saturation value of an image can make the colors more vibrant and vivid, while decreasing it can make the colors more subdued and muted. We augmented the saturation of our dataset by approximately 25%.

6.5. Grayscale

We incorporated grayscale augmentation, which converts an image into grayscale. This technique is commonly used to increase the contrast of an image and enhance its details.

6.6. Brightness

By randomly increasing the brightness of images, we subjected our model to a broader range of lighting conditions, thereby enhancing its resilience to changes in illumination. We augmented the images, rendering them approximately 15% brighter.

6.7. Blur

Blur is used to introduce out-of-focus effects into images. For our augmented data, we used Gaussian blur with up to 1.25 px.

6.8. Exposure

Additionally, we artificially modified the exposure level of the images, setting it in the range of −10% to +10%.

6.9. Cutout

We have also cut out small parts of objects in the scene. The purpose of this is to add occlusion to our experiment, i.e., an object being blocked, covered, or obscured from the camera view.
Table 2 shows our augmentation setting values and their impacts on images.
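A comparable augmentation pipeline can be expressed with the Albumentations library, as sketched below. The parameter values approximate the settings in Table 2, and the exact arguments (shift limits, kernel sizes, dropout hole sizes, probabilities) are our assumptions rather than the configuration used to build the augmented DAWN set; argument names may also vary across Albumentations versions.

```python
# Sketch of an augmentation pipeline roughly matching Table 2, using
# Albumentations. Values are approximations of the listed settings.
import albumentations as A

augment = A.Compose(
    [
        A.Rotate(limit=90, p=0.5),                                  # rotation
        A.HueSaturationValue(hue_shift_limit=15,
                             sat_shift_limit=25,
                             val_shift_limit=0, p=0.5),             # hue / saturation
        A.GaussNoise(p=0.3),                                        # synthetic noise
        A.ToGray(p=0.15),                                           # grayscale
        A.RandomBrightnessContrast(brightness_limit=0.15,
                                   contrast_limit=0.0, p=0.5),      # brightness / exposure
        A.GaussianBlur(blur_limit=(3, 3), p=0.3),                   # blur (~1.25 px)
        A.CoarseDropout(max_holes=4, max_height=40,
                        max_width=40, p=0.3),                       # cutout / occlusion
    ],
    # Keep YOLO-format bounding boxes consistent with the transformed image.
    bbox_params=A.BboxParams(format="yolo", label_fields=["class_labels"]),
)

# Usage: out = augment(image=img, bboxes=boxes, class_labels=labels)
```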

7. Experiments and Results

To test our model, we conducted several experiments, starting by setting our BRISQUE threshold score to 42.75. This score is the average quality score for the DAWN dataset, and any image above this average score goes through the enhancement stage. Table 3 explains the reason behind choosing 42.75 as our threshold point and illustrates the impact on image quality by measuring BRISQUE scores before and after augmentation. The table compares images from the DAWN dataset (sandstorms, rain, snow, and fog) with our extended augmented images of the same dataset, which aim to simulate adverse weather conditions. Across all weather conditions, augmented images generally exhibit higher BRISQUE scores, indicating a decline in image quality compared to the original DAWN images. As the table shows, the average scene quality of the augmented images is worse by about 9%. This lower quality of the augmented images can be attributed to the applied augmentations (blur, hue, saturation, noise, cutout, brightness, and exposure), which mimic effects that commonly occur during adverse weather. The observed differences underscore the importance of designing a quality assessment model to preserve image quality, particularly in adverse weather conditions where visual clarity is crucial for accurate scene observation and object detection.
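For reference, this threshold can be reproduced by averaging the BRISQUE score over the dataset, as in the sketch below; `brisque_score` and `load_image` are the same kind of placeholder helpers used earlier, and the directory name is an assumption.

```python
# Derive the quality threshold as the dataset-average BRISQUE score.
# brisque_score() and load_image() are placeholders for any BRISQUE
# implementation and image loader; the directory name is an assumption.
from pathlib import Path

def average_brisque(image_dir: str, brisque_score, load_image) -> float:
    paths = sorted(Path(image_dir).glob("*.jpg"))
    scores = [brisque_score(load_image(p)) for p in paths]
    return sum(scores) / len(scores)

# threshold = average_brisque("dawn/images", brisque_score, load_image)
# For the original DAWN images this average is reported as 42.75 (Table 3).
```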
The experimental scenario for the augmented DAWN dataset was executed within the Google Colab environment, harnessing the computational power of a Tesla T4 GPU. We made several modifications to the YOLOv5 architecture, aiming to create a suitable model for our domain. These modifications include changing the activation functions, testing the model with the SiLU and LeakyReLU functions, and modifying the backbone and head to compare the performance of the BottleneckCSP and C3 architectures. In addition, hyperparameters such as the number of epochs and the batch size were varied throughout our experiments.
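The sketch below illustrates, in simplified PyTorch, the kind of substitution we describe: a convolution block whose activation can be switched between SiLU and LeakyReLU, reused inside a reduced C3/Bottleneck-style module. It is a schematic of the idea with assumed channel sizes, not the full YOLOv5 definition.

```python
# Simplified sketch of swapping activations (SiLU vs. LeakyReLU) inside a
# C3/Bottleneck-style block. Channel sizes are illustrative only.
import torch
import torch.nn as nn

def make_act(name: str) -> nn.Module:
    return nn.SiLU() if name == "silu" else nn.LeakyReLU(0.1)

class ConvBlock(nn.Module):
    """Conv + BatchNorm + configurable activation."""
    def __init__(self, c_in, c_out, k=1, act="silu"):
        super().__init__()
        self.conv = nn.Conv2d(c_in, c_out, k, padding=k // 2, bias=False)
        self.bn = nn.BatchNorm2d(c_out)
        self.act = make_act(act)

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))

class C3Like(nn.Module):
    """Two parallel 1x1 branches, a small residual stack, then a fusion conv."""
    def __init__(self, c, act="silu"):
        super().__init__()
        self.branch1 = ConvBlock(c, c // 2, 1, act)
        self.branch2 = ConvBlock(c, c // 2, 1, act)
        self.inner = nn.Sequential(ConvBlock(c // 2, c // 2, 3, act),
                                   ConvBlock(c // 2, c // 2, 3, act))
        self.fuse = ConvBlock(c, c, 1, act)

    def forward(self, x):
        b1 = self.branch1(x)
        y1 = self.inner(b1) + b1                 # residual path
        y2 = self.branch2(x)
        return self.fuse(torch.cat([y1, y2], dim=1))

x = torch.randn(1, 64, 80, 80)
print(C3Like(64, act="leakyrelu")(x).shape)       # torch.Size([1, 64, 80, 80])
```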
After designing our model, we initiated our experimental phase by implementing BottleneckCSP as both the backbone and head architecture. Our model demonstrated promising results, achieving a mean average precision (mAP) of 55.6% and 45.6% when trained for 128 epochs with a batch size of 32, utilizing the SiLU and LeakyReLU activation functions, respectively. From Table 4, it can be clearly seen that, with BottleneckCSP, the mAP increased for both the SiLU and LeakyReLU functions whenever we increased the number of epochs. It can also be seen that LeakyReLU performs worse than SiLU with the BottleneckCSP backbone and head. Table 4 shows the complete results of our model using BottleneckCSP.
We continued our experiments by including Concentrated-Comprehensive Convolution (C3) [31] as the backbone and head in our proposed model. The model achieved a better result, reaching 71.8% mAP using SiLU with only 32 epochs and a batch size of 16, as Table 5 shows. This score is higher than LeakyReLU by 7.4 percentage points with the same settings. It is also higher than the best score achieved using the BottleneckCSP backbone (Table 4), which was 55.6%. We continued to increase the number of epochs and the batch size until we achieved 74.6% after 64 epochs with a batch size of 16, which is the highest mAP in this paper. As we will discuss below, this is also the highest mAP compared with other recent object detection publications that used DAWN as a base dataset. The mAP, precision, and recall of our model can be seen in Figure 9. The top left chart shows the precision result, with a score of 85%, while the top right chart shows the recall result reaching 68%. The bottom chart shows the resulting mAP, which reached 74.6%. Figure 10a shows our F1 score, and we can clearly see that the peak for most classes occurs at confidence thresholds between 0.4 and 0.6. This suggests that setting the model’s confidence threshold within this range would likely yield the best balance between precision and recall. It also shows that the model excels at detecting cars, while it struggles more with trucks, with overall performance following a similar pattern to the average across all classes. Figure 10b represents our mAP with intersection over union thresholds from 0.5 to 0.95.
The previously mentioned tables, Table 4 and Table 5, highlight a notable observation: increasing the number of epochs does not consistently correlate with higher mean average precision (mAP). Surprisingly, our model achieved its highest mAP score when trained for 64 epochs rather than 128 epochs, contrary to the initial expectation. This finding highlights the importance of careful hyperparameter tuning and validation experimentation in identifying the most effective training regimen for a given model architecture and dataset. Among recent publications utilizing the DAWN dataset, our method’s mean average precision (mAP) of 74.6% is the highest achieved, as detailed in Table 6.
For weather classification evaluation, the proposed model scored an accuracy of 74.3% after 64 epochs, as Figure 11 shows. The model has successfully classified most of the scenes; however, there are some cases where the model failed to classify the true weather. For instance, if we look at Table 7, which shows the experimental result of weather classification, in image number 5 the true weather was a strong sandstorm, whereas the model classified it as foggy weather. This case is an example of where the brightness and lighting of the scene could be challenging for weather classification models in adverse weather.
Figure 12, Figure 13 and Figure 14 are examples from our experiments where the model successfully classified the type of weather in the scene and detected the objects. The top left corner shows the probability scores for the current weather, ranked from highest to lowest. For instance, in Figure 12, the model classified the scene as sandy weather with 87% confidence, which matches the ground truth. It also successfully detected the two vehicles appearing in the scene, with detection confidences of 96% and 94%, respectively.

8. Discussion

While the preceding section demonstrates the potential of our method to detect various vehicles in adverse weather, in the following points we will discuss key insights and observations that emerged during the development of this work:
  • If we look at our F1 score (Figure 10), the “car” class consistently achieves the highest F1 scores across different confidence levels, indicating that the model is particularly adept at detecting cars accurately. Conversely, the “truck” class generally exhibits the lowest F1 scores, suggesting that the model might have more difficulty distinguishing trucks, or faces more false positives/negatives in this category. The “all classes” curve represents the average performance across all object classes, demonstrating a similar trend to the individual classes, with the peak F1 score around the 0.6 confidence threshold.
  • As Table 6 shows, our proposed work achieved a mAP of 74.6%. This result surpasses the performance of other publications on the DAWN dataset, including ensemble methods [32], YOLO modifications [33], GAN-based architectures [34], the LDETR transformer [35], and YOLO enhanced with metaheuristic algorithms [36].
  • Notably, DAWN is a very challenging dataset, as corroborated by our own experience and underscored by the observations of the authors in [33], who remarked: “[w]e find the DAWN dataset a bit more challenging than the others.” This challenge is due to the fact that some images and objects are characterized by exceedingly low visibility, which is a factor that may impact the resulting score of any developed model.
  • The domain of object detection in adverse weather still requires more reliable datasets that provide sufficient variability to cover various object appearances, lighting conditions, and occlusions. Creating such datasets is time-consuming and costly. A recently published paper by Liu et al. [37] demonstrated a simulator-based approach that allows easy manipulation of environmental conditions, object placement, and camera perspectives. Using simulator-based data collection opens the door to diverse and comprehensive datasets without extensive real-world data gathering. This approach can expedite data collection by setting up and executing various adverse weather scenarios without, for instance, waiting for the weather’s seasonal changes. Furthermore, it offers data scalability, overcoming the geographical constraints of real-world data collection.
  • While existing recent publications and public datasets offer valuable resources for object detection in various weather conditions, there is a clear need for more work that includes sandy weather scenarios.
  • Combining images with LiDAR using fusion can be a promising approach for enhancing object detection in autonomous vehicle environments. Recent studies, such as those by Dai et al. [38] and Liu et al. [39], have demonstrated that this technique significantly improves object detection in challenging environments by leveraging the complementary features of both LiDAR and cameras. Cameras provide a cost-effective, lightweight solution that captures rich color and texture details, aiding in the classification and identification of objects. On the other hand, LiDAR offers precise distance measurements and 3D spatial information, which are particularly useful in low-visibility conditions where cameras may struggle. By fusing the data from both sensors, the accuracy and robustness of object detection systems can be greatly enhanced.
  • We extended our experiments to test our model using the UAVDT dataset [40]. The original UAVDT dataset comprises over 77,000 images captured in daylight, night, and foggy weather conditions. After running the experiment for 64 epochs, we achieved the following results: mAP of 94.1%, recall of 90.8%, and precision of 97.0%. We believe that the UAVDT dataset requires additional preprocessing before it can be fully utilized. For instance, adjusting the time frame for capturing images could help diversify the resulting images.
  • Synthetic data can be used to address the challenges and limitations of real-world datasets. In a recent publication [41], the authors proposed CrowdSim2, a synthetic dataset, for object detection tasks, particularly the detection of people and vehicles. Such a technique can be very beneficial for the AV domain by offering a controlled environment where factors like weather conditions, object density, and lighting can be precisely manipulated, enabling the testing of object detection models under various scenarios. Additionally, it can be used to simulate rare but critical events, such as accidents or unusual obstacles, which may be underrepresented in real-world datasets.

9. Conclusions

Classifying weather and detecting objects in severe weather environments is a critical and challenging task. In this paper, we introduced a multi objectives model that integrates weather classification and object detection and treats them as a unified problem within the domain of autonomous vehicle perception systems. Our model consists of two main blocks. First, the Quality Block checks image quality based on the BRISQUE score; if the image scores higher than the threshold, it is enhanced by an SRGAN method. Second, the Classify and Detect Block classifies four types of adverse weather (snowy, rainy, foggy, and sandy) and detects six classes (car, cyclist, pedestrian, motorcycle, bus, and truck). During our development, we utilized the challenging DAWN dataset as our source of images and employed YOLO as the base structure for classification and detection. The experimental results show that our model achieved a mean average precision (mAP) of 74.6% for detecting objects using the YOLO architecture with C3 as the backbone and SiLU as the activation function. Additionally, for classifying the weather of the scene, our model attained an accuracy of 74.3%, which closely aligns with the mAP. That said, there are still challenges in this domain that should be considered when developing detection and classification models: changes in scene characteristics, such as lighting and cloudiness, can lead to misclassification of the weather.

10. Future Work

Adverse weather is still a very challenging domain in AV environments. To achieve the highest level of automation, camera sensors need a robust system that is capable of accurately observing the surroundings and enabling safe navigation in diverse weather scenarios. In the future, we will extend our domain to include additional datasets that could be merged with the current DAWN dataset. This could lead us to expand our detection classes to include more detailed and new classes that we observe in real driving environments, such as traffic lights, children, domestic animals (like dogs), and law enforcement personnel (such as police officers). Each of these classes represents an integral component of the road scene, and accurately detecting and responding to their presence is essential for ensuring the safety and efficiency of autonomous driving systems. By incorporating these additional classes into our detection framework, we aim to enhance the overall mAP. Additionally, we aim to enhance the perceptual capabilities of autonomous systems through perceptual fusion, which involves combining information from multiple sensors, such as cameras, LiDAR, radar, and ultrasonic sensors, to create a comprehensive and accurate representation of the surrounding environment. By developing such a robust system, we believe that we can mitigate the impact of adverse weather conditions on sensor performance and enhance the reliability and robustness of general AV perception systems.

Author Contributions

Software, N.A.; data analysis, N.A., A.A. and A.B.; data curation, N.A.; methodology, N.A.; writing—original draft preparation, N.A.; writing—review and editing, A.A. and A.B.; supervision, A.A. and A.B.; conceptualization, N.A., A.A. and A.B. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Data is contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. J3016_202104; Taxonomy and Definitions for Terms Related to Driving Automation Systems for On-Road Motor Vehicles. Society of Automotive Engineers (SAE): Warrendale, PA, USA, 2021.
  2. Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 580–587. [Google Scholar]
  3. Girshick, R. Fast R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 1440–1448. [Google Scholar]
  4. Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. Adv. Neural Inf. Process. Syst. 2015, 28, 91–99. [Google Scholar] [CrossRef] [PubMed]
  5. He, K.; Gkioxari, G.; Dollár, P.; Girshick, R. Mask R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2961–2969. [Google Scholar]
  6. Lin, T.Y.; Dollár, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature pyramid networks for object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 2117–2125. [Google Scholar]
  7. Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788. [Google Scholar]
  8. Redmon, J.; Farhadi, A. Yolov3: An incremental improvement. arXiv 2018, arXiv:1804.02767. [Google Scholar]
  9. Wang, C.Y.; Bochkovskiy, A.; Liao, H.Y.M. YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. arXiv 2022, arXiv:2207.02696. [Google Scholar]
  10. Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.Y.; Berg, A.C. Ssd: Single shot multibox detector. In Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands, 11–14 October 2016; Springer: Berlin/Heidelberg, Germany, 2016; pp. 21–37. [Google Scholar]
  11. Dhananjaya, M.M.; Kumar, V.R.; Yogamani, S. Weather and light level classification for autonomous driving: Dataset, baseline and active learning. In Proceedings of the 2021 IEEE International Intelligent Transportation Systems Conference (ITSC), Indianapolis, IN, USA, 19–22 September 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 2816–2821. [Google Scholar]
  12. Kondapally, M.; Kumar, K.N.; Vishnu, C.; Mohan, C.K. Towards a Transitional Weather Scene Recognition Approach for Autonomous Vehicles. IEEE Trans. Intell. Transp. Syst. 2023, 25, 5201–5210. [Google Scholar]
  13. Ibrahim, M.R.; Haworth, J.; Cheng, T. WeatherNet: Recognising weather and visual conditions from street-level images using deep residual learning. ISPRS Int. J. Geo-Inf. 2019, 8, 549. [Google Scholar] [CrossRef]
  14. Xia, J.; Xuan, D.; Tan, L.; Xing, L. ResNet15: Weather recognition on traffic road with deep convolutional neural network. Adv. Meteorol. 2020, 2020, 6972826. [Google Scholar] [CrossRef]
  15. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
  16. Cao, Y.; Li, C.; Peng, Y.; Ru, H. MCS-YOLO: A multiscale object detection method for autonomous driving road environment recognition. IEEE Access 2023, 11, 22342–22354. [Google Scholar]
  17. Liu, Z.; Lin, Y.; Cao, Y.; Hu, H.; Wei, Y.; Zhang, Z.; Lin, S.; Guo, B. Swin transformer: Hierarchical vision transformer using shifted windows. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 11–17 October 2021; pp. 10012–10022. [Google Scholar]
  18. Pan, G.; Fu, L.; Yu, R.; Muresan, M.I. Winter Road Surface Condition Recognition Using a Pre-Trained Deep Convolutional Neural Network. In Proceedings of the Transportation Research Board 97th Annual Meeting, Washington, DC, USA, 7–11 January 2018; pp. 838–855. [Google Scholar]
  19. Wang, R.; Zhao, H.; Xu, Z.; Ding, Y.; Li, G.; Zhang, Y.; Li, H. Real-time vehicle target detection in inclement weather conditions based on YOLOv4. Front. Neurorobotics 2023, 17, 34. [Google Scholar]
  20. Li, X.; Wu, J. Extracting High-Precision Vehicle Motion Data from Unmanned Aerial Vehicle Video Captured under Various Weather Conditions. Remote Sens. 2022, 14, 5513. [Google Scholar] [CrossRef]
  21. Liu, W.; Ren, G.; Yu, R.; Guo, S.; Zhu, J.; Zhang, L. Image-adaptive YOLO for object detection in adverse weather conditions. In Proceedings of the AAAI Conference on Artificial Intelligence, Philadelphia, PA, USA, 27 February–2 March 2022; Volume 36, pp. 1792–1800. [Google Scholar]
  22. Huang, S.C.; Le, T.H.; Jaw, D.W. DSNet: Joint semantic learning for object detection in inclement weather conditions. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 43, 2623–2633. [Google Scholar]
  23. Sharma, T.; Debaque, B.; Duclos, N.; Chehri, A.; Kinder, B.; Fortier, P. Deep learning-based object detection and scene perception under bad weather conditions. Electronics 2022, 11, 563. [Google Scholar] [CrossRef]
  24. Jung, H.K.; Choi, G.S. Improved yolov5: Efficient object detection using drone images under various conditions. Appl. Sci. 2022, 12, 7255. [Google Scholar] [CrossRef]
  25. Abdulghani, A.M.A.; Dalveren, G.G.M. Moving Object Detection in Video with Algorithms YOLO and Faster R-CNN in Different Conditions. Avrupa Bilim Ve Teknoloji Dergisi 2022, 33, 40–54. [Google Scholar] [CrossRef]
  26. Kenk, M.A.; Hassaballah, M. DAWN: Vehicle detection in adverse weather nature dataset. arXiv 2020, arXiv:2008.05402. [Google Scholar]
  27. Mittal, A.; Moorthy, A.K.; Bovik, A.C. No-reference image quality assessment in the spatial domain. IEEE Trans. Image Process. 2012, 21, 4695–4708. [Google Scholar] [CrossRef] [PubMed]
  28. Ledig, C.; Theis, L.; Huszár, F.; Caballero, J.; Cunningham, A.; Acosta, A.; Aitken, A.; Tejani, A.; Totz, J.; Wang, Z.; et al. Photo-realistic single image super-resolution using a generative adversarial network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 4681–4690. [Google Scholar]
  29. Zoph, B.; Cubuk, E.D.; Ghiasi, G.; Lin, T.Y.; Shlens, J.; Le, Q.V. Learning data augmentation strategies for object detection. In Proceedings of the Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, 23–28 August 2020; Springer: Berlin/Heidelberg, Germany, 2020; pp. 566–583, Part XXVII 16. [Google Scholar]
  30. Volk, G.; Müller, S.; Von Bernuth, A.; Hospach, D.; Bringmann, O. Towards robust CNN-based object detection through augmentation with synthetic rain variations. In Proceedings of the 2019 IEEE Intelligent Transportation Systems Conference (ITSC); IEEE: Piscataway, NJ, USA, 2019; pp. 285–292. [Google Scholar]
  31. Park, H.; Yoo, Y.; Seo, G.; Han, D.; Yun, S.; Kwak, N. C3: Concentrated-comprehensive convolution and its application to semantic segmentation. arXiv 2018, arXiv:1812.04920. [Google Scholar]
  32. Walambe, R.; Marathe, A.; Kotecha, K.; Ghinea, G. Lightweight object detection ensemble framework for autonomous vehicles in challenging weather conditions. Comput. Intell. Neurosci. 2021, 2021, 5278820. [Google Scholar] [CrossRef] [PubMed]
  33. Farid, A.; Hussain, F.; Khan, K.; Shahzad, M.; Khan, U.; Mahmood, Z. A fast and accurate real-time vehicle detection method using deep learning for unconstrained environments. Appl. Sci. 2023, 13, 3059. [Google Scholar] [CrossRef]
  34. Musat, V.; Fursa, I.; Newman, P.; Cuzzolin, F.; Bradley, A. Multi-weather city: Adverse weather stacking for autonomous driving. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 11–17 October 2021; pp. 2906–2915. [Google Scholar]
  35. Tiwari, A.K.; Pattanaik, M.; Sharma, G. Low-light DEtection TRansformer (LDETR): Object detection in low-light and adverse weather conditions. Multimed. Tools Appl. 2024, 1–18. [Google Scholar] [CrossRef]
  36. Özcan, I.; Altun, Y.; Parlak, C. Improving YOLO Detection Performance of Autonomous Vehicles in Adverse Weather Conditions Using Metaheuristic Algorithms. Appl. Sci. 2024, 14, 5841. [Google Scholar] [CrossRef]
  37. Liu, S.; Zhang, H.; Qi, Y.; Wang, P.; Zhang, Y.; Wu, Q. AerialVLN: Vision-and-language navigation for UAVs. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Paris, France, 2–6 October 2023; pp. 15384–15394. [Google Scholar]
  38. Dai, Z.; Guan, Z.; Chen, Q.; Xu, Y.; Sun, F. Enhanced Object Detection in Autonomous Vehicles through LiDAR—Camera Sensor Fusion. World Electr. Veh. J. 2024, 15, 297. [Google Scholar] [CrossRef]
  39. Liu, H.; Wu, C.; Wang, H. Real time object detection using LiDAR and camera fusion for autonomous driving. Sci. Rep. 2023, 13, 8056. [Google Scholar] [CrossRef] [PubMed]
  40. Du, D.; Qi, Y.; Yu, H.; Yang, Y.; Duan, K.; Li, G.; Zhang, W.; Huang, Q.; Tian, Q. The unmanned aerial vehicle benchmark: Object detection and tracking. In Proceedings of the European conference on computer vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 370–386. [Google Scholar]
  41. Foszner, P.; Szczesna, A.; Ciampi, L.; Messina, N.; Cygan, A.; Bizon’, B.; Cogiel, M.; Golba, D.; Macioszek, E.; Staniszewski, M. CrowdSim2: An open synthetic benchmark for object detectors. arXiv 2023, arXiv:2304.05090. [Google Scholar]
Figure 1. Main roles of cameras and their systems in AVs.
Figure 2. Some adverse weather challenges. (a,b) Heavy snow can obscure road lanes. (c) Rain leads to distortion and blurring of captured images. (d,e) A sandstorm can obscure road boundaries and change the lighting of the scene. (f) Fog makes it difficult for cameras to distinguish objects from their backgrounds.
Figure 3. The scope of this paper is highlighted in blue.
Figure 4. Sequence of our methodology in this paper.
Figure 5. The DAWN dataset provides various hard weather conditions such as fog, rain, snow, and sand [26].
Figure 6. Sample of labeled images.
Figure 7. Our proposed model.
Figure 8. The DAWN dataset increased from 1027 to 2046 images.
Figure 9. Precision, recall, and mAP at 0.5 of our model after 64 epochs.
Figure 10. F1 score and mAP at 0.5:0.95 of our model after 64 epochs.
Figure 11. For weather classification, our model scored 74.3% after 64 epochs.
Figure 12. The model has successfully classified sandy weather and detected objects in the scene.
Figure 13. The model has successfully classified foggy weather and detected objects in the scene.
Figure 14. The model has successfully classified rainy weather and detected objects in the scene.
Table 1. Recent publications in the AV environment eliminate sandy weather from their studies. Moreover, there is a gap in combining weather classification and object detection.
Paper | Weather Classification | Object Detection | Drive in Snow | Drive in Rain | Drive in Fog | Drive in Sand
[11]××
[13]×××
[14]××
[16]××
[18]××××
[19]××
[20]×××
[21]××××
[22]××××
[23]××××
[24]×××
[25]××
Ours
Table 2. Summary of applied augmentations and their impact on images.
Augmentation | Value | Impact
Rotation | 90 degrees | Helps the model to be insensitive to camera orientation
Hue | 15% | Random adjustment of colors
Noise | Random noise | More obstacles added to image
Saturation | 25% | Changes the intensity of the pixels
Grayscale | 15% | Converts image to single channel
Brightness | 15% | Image appears lighter
Blur | 1.25 px | Averages pixel values with neighboring ones
Exposure | 10% | Resilient to lighting and camera setting changes
Cutout | Cut random parts of the image | More resilient to detect half objects
Table 3. Comparison of image quality with and without augmentation. The results show an average quality score of 46.59 for the augmented DAWN dataset compared to 42.75 for the original DAWN dataset.
Dataset | Sandy | Foggy | Snowy | Rainy | Average
DAWN images (quality score) | 44.05 | 45.21 | 40.18 | 41.57 | 42.75
Augmented DAWN images (quality score) | 48.71 | 49.83 | 43.19 | 44.64 | 46.59
Table 4. Performance of our model using BottleneckCSP as a backbone and head.
Backbone and Head | Activation Function | Epochs | Batch | mAP
BottleneckCSP | SiLU | 32 | 16 | 33.7%
BottleneckCSP | SiLU | 32 | 32 | 34.5%
BottleneckCSP | SiLU | 64 | 16 | 40.2%
BottleneckCSP | SiLU | 64 | 32 | 43.9%
BottleneckCSP | SiLU | 128 | 16 | 55.2%
BottleneckCSP | SiLU | 128 | 32 | 55.6%
BottleneckCSP | LeakyReLU | 32 | 16 | 24.0%
BottleneckCSP | LeakyReLU | 32 | 32 | 25.1%
BottleneckCSP | LeakyReLU | 64 | 16 | 34.7%
BottleneckCSP | LeakyReLU | 64 | 32 | 34.9%
BottleneckCSP | LeakyReLU | 128 | 16 | 38.7%
BottleneckCSP | LeakyReLU | 128 | 32 | 45.6%
Table 5. Performance of our model using C3 as a backbone and head.
Backbone and Head | Activation Function | Epochs | Batch | mAP
C3 | SiLU | 32 | 16 | 71.8%
C3 | SiLU | 32 | 32 | 68.6%
C3 | SiLU | 64 | 16 | 74.6%
C3 | SiLU | 64 | 32 | 74.1%
C3 | SiLU | 128 | 16 | 74.0%
C3 | SiLU | 128 | 32 | 73.1%
C3 | LeakyReLU | 32 | 16 | 64.4%
C3 | LeakyReLU | 32 | 32 | 67.1%
C3 | LeakyReLU | 64 | 16 | 62.9%
C3 | LeakyReLU | 64 | 32 | 63.2%
C3 | LeakyReLU | 128 | 16 | 72.9%
C3 | LeakyReLU | 128 | 32 | 72.4%
Table 6. Comparison of our result with some recent publications that used the DAWN dataset.
Ref. | mAP | Dataset | Aim
[32] | 32.75% | DAWN | Ensemble approach to improve object detection in AVs under adverse weather conditions.
[33] | Fog 29.66%, Rain 41.21%, Snow 43.01%, Sand 24.13% | DAWN | Modifying YOLO and using several datasets to detect objects in the AV environment.
[34] | 39.19% | DAWN | Architecture for constructing datasets using GAN and CycleGAN.
[35] | 55.85% | DAWN | Low-light Detection Transformer (LDETR) to improve detection performance.
[36] | 72.8% | DAWN | Improving YOLO using metaheuristic algorithms.
Ours | 74.6% | DAWN | Modifying YOLO and using the DAWN dataset to classify weather and detect objects in the AV environment.
Table 7. Classification results.
Number | Image | Ground Truth | Classified
1 | i001 | Sandy weather | Sandy weather
2 | i002 | Foggy weather | Foggy weather
3 | i003 | Rainy weather | Rainy weather
4 | i004 | Snowy weather | Snowy weather
5 | i005 | Sandy weather | Foggy weather
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
