Review

Traffic Sign Detection and Recognition Using YOLO Object Detection Algorithm: A Systematic Review

1 Department of Electrical, Electronics and Telecommunications, Universidad de las Fuerzas Armadas, Av. General Rumiñahui s/n, Sangolquí 171103, Ecuador
2 Department of Systems Engineering, I&H Tech, Latacunga 050102, Ecuador
3 Department of Computer Science, Faculty of Engineering, Universidad de Talca, Curicó 3340000, Chile
4 Faculty of Engineering Sciences, Universidad Católica del Maule, Talca 3466706, Chile
5 Department of Systems and Automation Engineering, Universidad Carlos III de Madrid, Av. de la Universidad 30, Leganés, 28911 Madrid, Spain
* Authors to whom correspondence should be addressed.
Mathematics 2024, 12(2), 297; https://doi.org/10.3390/math12020297
Submission received: 19 November 2023 / Revised: 6 January 2024 / Accepted: 12 January 2024 / Published: 17 January 2024

Abstract:
Context: YOLO (You Only Look Once) is an algorithm based on deep neural networks with real-time object detection capabilities. This state-of-the-art technology is widely available, mainly due to its speed and precision. Since its conception, YOLO has been applied to detect and recognize traffic signs, pedestrians, traffic lights, vehicles, and so on. Objective: The goal of this research is to systematically analyze the YOLO object detection algorithm, applied to traffic sign detection and recognition systems, from five relevant aspects of this technology: applications, datasets, metrics, hardware, and challenges. Method: This study performs a systematic literature review (SLR) of studies on traffic sign detection and recognition using YOLO published in the years 2016–2022. Results: The search found 115 primary studies relevant to the goal of this research. After analyzing these investigations, the following relevant results were obtained. The most common applications of YOLO in this field are vehicular security and intelligent and autonomous vehicles. The majority of the sign datasets used to train, test, and validate YOLO-based systems are publicly available, with an emphasis on datasets from Germany and China. It has also been discovered that most works report sophisticated detection, classification, and processing speed metrics for traffic sign detection and recognition systems built on the different versions of YOLO. In addition, the most popular desktop GPUs for data processing are the Nvidia RTX 2080, Titan, and Tesla V100, while for embedded or mobile GPU platforms it is the Jetson Xavier NX. Finally, seven relevant challenges that these systems face when operating in real road conditions have been identified, and the primary studies have been classified according to the challenges they address. Conclusions: This SLR is the most relevant and current work in the field of technology development applied to the detection and recognition of traffic signs using YOLO. In addition, insights are provided about future work that could be conducted to improve the field.

1. Introduction

Road Traffic Accidents (RTA) are among the leading causes of damage, injuries, and deaths worldwide [1]. These accidents are events that occur on roads and highways involving vehicles. They can be attributed to various factors, including human error, environmental conditions, technical malfunctions, or a combination of these. Furthermore, the World Health Organization (WHO) [2] indicates that road traffic injuries ranked as the eighth leading cause of death worldwide in 2018, accounting for 2.5% of all deaths. Based on data from 2015, the WHO also estimates that around 1.25 million road traffic deaths occur annually.
In the United States, approximately 12.15 million vehicles were involved in crashes in 2019. The number of road accidents per one million inhabitants in this country is forecast to decline in the coming years, reaching just over 7100 in 2025 [3]. In Europe, between 2010 and 2020, the number of road deaths decreased by 36%. Compared to 2019, when there were 22,800 fatalities, 4000 fewer people lost their lives on EU roads in 2020 [4].
According to Yu et al. [1], many studies have focused on traffic safety, including traffic accident analysis, vehicle collision detection, collision risk warning, and collision prevention. In addition, several intelligent systems have been proposed that specialize in traffic sign detection and recognition using computer vision (CV) and deep learning (DL). In this context, one of the most popular technologies is the YOLO object detection algorithm [5,6,7,8,9,10,11,12,13].
This article presents an SLR on the detection and recognition of traffic signs using the YOLO object detection algorithm. In this context, a traffic sign serves as a visual guide to convey information about road conditions, potential hazards, and other essential details for safe road navigation [14]. Meanwhile, YOLO, a model based on convolutional neural networks, is specifically designed for object detection [6]. This algorithm has been selected because of its competitiveness compared to other methods based on DL, with respect to processing speed on GPUs, high performance rates regarding the most critical metrics, and simplicity [15,16,17].
Figure 1 shows the global scheme of this type of system. An input image captured with a camera is fed to the YOLO object detection algorithm, and through a deep convolutional neural network it detects objects and outputs isolated traffic signs when appropriate. Subsequently, it provides pertinent information to the driver (or the autonomous driving system) to make driving safer, more efficient, and comfortable.
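To make this pipeline concrete, the following minimal sketch runs a pretrained YOLOv5 model over a single frame via PyTorch Hub. This is only an illustration under stated assumptions: it uses the generic COCO weights distributed by Ultralytics, and "road_scene.jpg" is a hypothetical placeholder; a deployed system would instead load weights fine-tuned on a traffic sign dataset.

```python
# Minimal sketch of the Figure 1 pipeline: image in, detections out.
# Assumes the publicly distributed Ultralytics YOLOv5 release on PyTorch Hub;
# "road_scene.jpg" is a hypothetical placeholder image.
import torch

# Load a small pretrained model (COCO weights; a traffic sign system
# would load weights fine-tuned on a sign dataset such as GTSDB).
model = torch.hub.load("ultralytics/yolov5", "yolov5s", pretrained=True)

results = model("road_scene.jpg")        # single forward pass over the frame
detections = results.pandas().xyxy[0]    # bounding boxes, confidences, labels
print(detections[["xmin", "ymin", "xmax", "ymax", "confidence", "name"]])
```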
The main contributions of this SLR are to gather evidence to answer the following five research questions (RQs): (1) What are the main applications of traffic sign detection and recognition using YOLO? (2) What traffic sign datasets are used to train, validate, and test these systems? (3) What metrics are used to measure the quality of object detection in the context of traffic sign detection and recognition using YOLO? (4) What hardware is used to implement traffic sign recognition and detection systems based on YOLO? (5) What are the problems and challenges encountered in the detection and recognition of traffic signs using YOLO?
The remainder of the paper is organized as follows. This initial section outlines the primary objective of this study. Section 2 details the materials and methods used in this review. Section 3 focuses on the results of this study. Section 4 presents the research and practical implications. Finally, the last part is devoted to conclusions and future work.

2. Materials and Methods

This section provides details and guidelines essential for executing a targeted SLR focused on the detection and recognition of traffic signs, employing the YOLO object detection algorithm.

2.1. Traffic Signs

Traffic signs are visual cues and symbols that are placed on public roads to warn, inform, order, or regulate the behavior of road users, especially in densely populated and busy urban areas. They contain a simple visual symbolic language so that the driver can interpret and instantly obtain information from the road for safe driving [5].
Traffic signs are typically made of reflective materials that are visible at night and under low-light conditions. The reflective design not only enhances safety by facilitating night-time visibility but also ensures that drivers can easily discern and understand the intended messages. Each sign conveys a unique message and is distinguished by shape, color, and size, aligning with specific road directives and offering drivers accurate and effective communication, contributing to an overall safer and well-regulated traffic environment [14].
Among the multitude of characteristics, two stand out, namely shape and color, from which traffic signs can be grouped into three types: prohibitive, preventive, and informative [19,20]. Prohibitive (or regulatory) signs inform drivers of the restrictions they must comply with; they are often circular and red in color. Preventive (or warning) signs warn of possible dangers on the road and are generally yellow diamonds. Informative (or indicative) signs are designed to assist drivers in navigation tasks. Typically, these signs are rectangular and colored green or blue, providing essential information for route guidance.
The visual appearance of traffic signs varies significantly from country to country, posing a challenge for classification systems. This represents a drawback for the development of global traffic sign detection and recognition systems and limits their development to certain types of signs or countries [19]. Other challenging conditions are illumination changes, occlusions, perspective distortion, weather conditions, aging, blur, and human artifacts. Under such extreme conditions, no method is able to complete the detection task efficiently.

2.2. YOLO Object Detection Algorithm

YOLO is a state-of-the-art technology developed for object detection based on DL, with emphasis on real-time and high-accuracy applications [21]. In YOLO, object detection is treated as a regression problem where candidate images and their categories and confidence indices are directly generated by regression. The detection result is finally determined by setting a threshold of the confidence rate and the non-maximum suppression technique [6].
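To illustrate this post-processing step, the following is a minimal sketch, not taken from any particular YOLO implementation, of confidence thresholding followed by greedy non-maximum suppression over axis-aligned boxes; the thresholds are illustrative defaults.

```python
# Minimal sketch of YOLO-style post-processing: discard low-confidence
# boxes, then greedily suppress boxes that overlap a kept box too much.
# Box format is assumed to be (x1, y1, x2, y2); thresholds are illustrative.
from typing import List, Tuple

Box = Tuple[float, float, float, float]

def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def threshold_and_nms(boxes: List[Box], scores: List[float],
                      conf_thr: float = 0.25, iou_thr: float = 0.45) -> List[int]:
    """Return indices of detections kept after thresholding and NMS."""
    idxs = [i for i, s in enumerate(scores) if s >= conf_thr]
    idxs.sort(key=lambda i: scores[i], reverse=True)  # highest confidence first
    kept: List[int] = []
    for i in idxs:
        if all(iou(boxes[i], boxes[j]) < iou_thr for j in kept):
            kept.append(i)
    return kept
```

In practice, NMS is typically applied per class, so that overlapping signs of different types do not suppress one another.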
The primary advantage of YOLO lies in its capability for real-time image processing, making it well-suited for applications like autonomous vehicles and Advanced Driver Assistance Systems (ADAS) [18]. Moreover, YOLO achieves state-of-the-art accuracy with limited training data, surpassing other methods. Additionally, its ease of implementation and user-friendliness contribute to its popularity in the field of computer vision (CV).
In contrast, several authors have identified some limitations and handicaps. One important limitation is that YOLO struggles to detect small objects. The algorithm divides an image into grid cells and detects objects within these cells. Objects smaller than these predefined areas may be missed by the algorithm. Another problem is that YOLO does not consider the semantics of the image, ignoring the meaning of the visual data. Certain versions of YOLO are pre-trained using academic datasets, not considering real data that may be blurry, contain obstructed objects, and generally have a low resolution [22].
The first models were implemented by Redmon et al. [21] and Redmon and Farhadi [23,24], starting with the first version in 2016, which used the convolutional neural network called DarkNet [25]; followed by the second version, known as YOLO9000 [23], in 2017, using Darknet-19; and ending in 2018 with YOLOv3 [24], using Darknet-53. The fourth version, released in April 2020 by Bochkovskiy et al. [26,27], uses CSPDarknet-53. The fifth version was released in May 2020 by Jocher and the company Ultralytics [28]; this variant uses CSPNet as its backbone network. YOLOv5 comes in several sizes: YOLOv5s, YOLOv5m, YOLOv5l, and YOLOv5x.
It is important to mention that this research has only considered the five fundamental versions of the YOLO object detection algorithm. Other variants, such as the tiny YOLO models, have been grouped with the corresponding original versions.

2.3. State of the Art

Presently, there is a limited corpus of literature specifically addressing the detection and recognition of traffic signs using systematic review or survey methodologies. Nonetheless, noteworthy contributions exist, particularly those investigating machine learning methods and their integration in the advancement of ADAS [18] or autonomous vehicles. Liu et al. [5] have presented a study in which traffic sign detection methods are grouped into five categories: color-based methods, shape-based methods, color- and shape-based methods, machine-learning-based methods, and LIDAR-based methods. They have also shown that mobile laser scanning technology has experienced significant growth in the last five years and has been a key solution in many ADAS. Interestingly, the prevalence of YOLO-based systems is scarcely acknowledged within the scope of their findings. Wali et al. [29] have indicated that the investigation of automatic traffic sign detection and recognition systems is a crucial research area targeted at advancing CV-based ADAS. Additionally, this article provides a comprehensive examination of methods for detecting, classifying, and tracking traffic signs using machine learning. Notably, the use of YOLO is not explicitly mentioned. On the other hand, Borrego-Carazo et al. [30] have conducted a systematic review of machine learning techniques and embedded systems for implementing ADAS on mobile devices. Their examination extends to a marginal analysis of traffic sign recognition within the framework of machine-learning methodologies; notably, they make no reference to the YOLO algorithm, a significant omission in the contemporary discourse on such systems. In their contribution, Muhammad et al. [31] have conducted a comprehensive analysis elucidating the challenges and prospective trajectories within the domain of DL applications for autonomous vehicle development. Within this framework, particular emphasis is placed on traffic sign detection systems, acknowledged as pivotal components in the evolution of future vehicles. It is noteworthy that the YOLO algorithm garners substantial attention, especially concerning its application in the realm of traffic light detection.
Concluding this overview of related work, Diwan et al. [16] have presented the fundamental architectures, applications, and challenges associated with various object detectors based on DL neural networks. The authors posit that YOLO manifests notable superiority, particularly in terms of both detection accuracy and inference time. Unfortunately, the study lacks specific information pertaining to YOLO-based traffic sign detection systems.

2.4. Systematic Literature Review Methodology

This section presents an overview of planning, conducting, and reporting the review, following the SLR process outlined in [32].
An SLR is used to identify, select, and systematically evaluate all relevant evidence on a specific topic. The SLR process searches various databases and sources of information, using rigorous selection criteria to identify relevant studies and assess their quality. An SLR is considered a rigorous and objective way of summarizing and analyzing existing evidence on a topic and is frequently utilized in areas such as health and social sciences to inform research and decision-making.
SLRs have applications in different fields of engineering. For example, they can help engineers identify the latest trends and developments in a specific field, analyze the effectiveness of different techniques and approaches used in previous projects, and identify potential problems or challenges in a particular engineering area. SLRs can also help compare different studies and assess the quality of existing evidence on a specific topic. Overall, SLRs are a valuable tool for engineering research and decision-making [33].
The review protocol followed in this investigation consists of six stages: defining the research questions, designing a search strategy, selecting the studies, measuring the quality of the articles, extracting the data, and synthesizing the data [33].
First, we developed a set of research questions based on the objective of this SLR. Next, we designed a search strategy to find studies relevant to our research questions; this step involved defining the scientific databases to be used in the process and the search terms. In the third stage, we specified the criteria for determining which articles address the research questions and which should be discarded from the study. As part of this phase, pilot studies were conducted to establish the inclusion and exclusion criteria more effectively. In the next stage, the quality of the articles was quantified to determine whether they met the minimum standards for inclusion in the review. The following steps covered data extraction and synthesis. During the data extraction stage, a pilot process was also performed to determine which data to extract and how to store them for subsequent synthesis. Finally, in the synthesis stage, we decided on strategies to generate syntheses depending on the type of data analyzed [34].
Following a protocol for an SLR is vital to ensure rigor and reproducibility and to minimize investigator bias. In the remainder of this section, the review protocol is presented in detail [34].

2.5. Research Questions

The RQs are derived from the general objective of the investigation and structure the literature search, determining the selection criteria for including or excluding studies in the review. The RQs are also used to guide the interpretation and analysis of the review results. The following are the RQs formulated in this investigation.
RQ1 [Applications]: What are the main applications of traffic sign detection and recognition using YOLO?
Knowing the applications of traffic sign detection helps researchers and developers identify areas of opportunity and challenges in the field of CV. This can lead to new advances and improvements in traffic sign detection technology. Additionally, knowing the applications of traffic sign detection can also be useful for users of the technology as it provides a better understanding of how traffic sign detection can be employed.
RQ2 [Datasets]: What traffic sign datasets are used to train, validate, and test these systems?
For several reasons, it is essential to know the datasets used to validate the detection and recognition of traffic signs. First, the datasets provide a source of information to evaluate the accuracy and effectiveness of detection methods. Second, datasets also allow us to observe the behavior of recognition systems in different situations and with different types of data, which allows us to improve and optimize the application. Furthermore, knowledge of the datasets makes it possible to identify potential application problems or errors and correct them before launching them on the market.
RQ3 [Metrics]: What metrics are used to measure the quality of object detection in the context of traffic sign detection and recognition using YOLO?
Performance metrics to measure the accuracy of object detection are crucial for the proper functioning of traffic sign detection and recognition systems. If object detection is not accurate, these systems may not function properly and may even be dangerous to humans. Additionally, these performance measures are useful to compare different object detection systems and algorithms and determine which one is the most suitable for a particular application. Finally, knowing appropriate metrics allows one to identify errors in an object detection system and to develop solutions to improve its accuracy.
RQ4 [Hardware]: What hardware is used to implement traffic sign recognition and detection systems based on YOLO?
Hardware is a crucial element in traffic recognition systems as it provides the processing and storage capacity needed to analyze and process the video images captured by the cameras. Without the right hardware, traffic recognition systems would not be able to function properly and would not be able to detect and track objects in real time. Additionally, hardware also plays an important role in the speed and efficiency of traffic recognition and detection systems as more powerful hardware can process and analyze images faster and more accurately.
RQ5 [Challenges]: What are the problems and challenges encountered in the detection and recognition of traffic signs using YOLO?
Being aware of the problems and challenges of traffic sign detection and recognition using YOLO helps researchers and developers to identify areas where the technology needs to be improved. Also, knowing these problems and challenges enables better understanding of its limitations and potential drawbacks. This helps to make informed decisions about the appropriate use of the technology mentioned above.

2.6. Search Strategy

This section presents the steps used to identify, select, and evaluate relevant studies for inclusion in the review. It includes a description of databases and sources of information, keywords, and inclusion and exclusion criteria applied to identify relevant studies.

2.6.1. Databases of Digital Library

In the context of an SLR, a database is a collection of published research articles that can be searched using specific keywords. These databases are typically used to identify relevant studies to be included in the review. The selected databases for this study are IEEE Xplore, MDPI, PLOS, Science Direct, Wiley, Sage, Hindawi Publishing Group, Taylor & Francis, and Springer Nature, all of which contain articles indexed in Web of Science (WoS) and/or Scopus.
WoS serves as a robust research database and citation index, offering a widely utilized platform that grants access to an extensive array of scholarly articles, conference proceedings, and various research materials. Conversely, Scopus, produced by Elsevier Co., is an abstract and indexing database with embedded full-text links.
Furthermore, IEEE Xplore, MDPI, PLOS, Science Direct, Wiley, Sage, Hindawi Publishing Group, Taylor & Francis, and Springer Nature are renowned digital libraries. These platforms provide access to an expansive collection of scientific and technical content, encompassing disciplines such as electrical engineering, computer science, electronics, and other related fields.

2.6.2. Timeframe of Study

This study encompasses documents published from 2016 through the end of 2022, extending to early publications of 2023. The selection of 2016 as the starting point aligns with the initial release of the YOLO algorithm.

2.6.3. Keywords

The primary keywords used to search for relevant studies are YOLO, traffic sign, recognition, detection, identification, and object detection.
Later on, the search strings formulated with these keywords will be presented for each digital library.

2.6.4. Inclusion and Exclusion Criteria

To ensure that the SLR concentrates on high-quality, relevant studies, specific inclusion and exclusion criteria are defined.
Inclusion Criteria:
  • Studies must evaluate traffic sign detection or recognition using the YOLO object detection algorithm.
  • Only studies published between 2016 and 2022 are considered.
  • The study should be published in a peer-reviewed journal or conference proceedings.
  • Preference is given to documents categorized as “Journal” or “Conference” articles.
  • The study must be in English.
Exclusion Criteria:
  • Studies that do not utilize the YOLO object detection algorithm for traffic sign detection or recognition.
  • Research not focused on traffic sign detection or recognition.
  • Publications outside the 2016–2022 timeframe.
  • Non-peer-reviewed articles and documents.
  • Studies published in languages other than English.

2.6.5. Study Selection

This section outlines the search and selection methodology used in this study, detailing the progression from the initial number of articles identified to the final selection of studies included in the review.
In the initial phase, a keyword-based search across nine bibliographic databases yielded a total of 594,890 documents. The search strings employed, along with the number of articles identified in each digital library, are detailed in Table 1.
The search was then narrowed to include only publications from 2016 to 2022, with early access articles from 2023 also considered. This refinement resulted in the exclusion of 369,215 documents. Further filtering focused on documents classified under ‘Journals’ or ‘Conferences’, leading to the elimination of an additional 15,020 documents. Priority was given to articles employing YOLO as the primary detection method, which further narrowed the field by excluding 209,378 documents. After these initial stages, 1277 documents remained across the nine databases; together with 171 articles sourced specifically from WoS and Scopus, this yielded 1448 candidate documents.
A thorough curation process followed. This involved removing duplicates (755 documents) and applying the stringent inclusion and exclusion criteria, which led to the further removal of 364 documents due to keyword-related issues, 177 for abstract-related discrepancies, and 37 for lack of sufficient information. The culmination of this meticulous process resulted in a final selection of 115 documents (1448 − 755 − 364 − 177 − 37), which comprise the corpus for analysis in this SLR.

2.7. Data Extraction

The data extraction step in an SLR is the process of collecting relevant information from the primary studies that meet the inclusion and exclusion criteria. The purpose of this step is to synthesize and analyze the data to answer the research questions and validate the results of the SLR.
To determine YOLO’s practical applications, we applied a rigorous evaluation framework, guided by experts. This approach revealed three main domains: road safety, ADAS, and autonomous driving. When an article did not explicitly mention its application domain, we conducted a thorough analysis to deduce the specific context.
To identify relevant datasets, a comprehensive keyword search for ‘data set’ was performed within each article. Upon locating a dataset, its relevance was carefully evaluated before meticulously recording it in the extraction table.
Furthermore, a rigorous review of each article was undertaken to pinpoint and document the specific metrics utilized. These metrics were then systematically integrated into the extraction table, ensuring comprehensive documentation and analysis.
Additionally, we conducted a thorough examination of the hardware configurations employed in each study, systematically recording these details in the extraction table. This approach facilitated the identification of prevalent and frequently used hardware components.
To delineate the challenges intrinsic to YOLO technology, we applied an expert-defined criterion, outlining seven distinct categories: variations in lighting conditions, adverse weather conditions, partial occlusion, signs with visible damage, complex scenarios, sub-optimal image quality, and region-specific concerns. In numerous instances, the challenge addressed may remain implicit, necessitating a detailed analysis of each article to deduce it.

2.8. Data Synthesis

The data synthesis step in an SLR is the process of gathering and interpreting the relevant data extracted from the selected studies to answer the research questions of the SLR.
First, we categorized real-world YOLO applications, providing a clear overview of deployment areas. Next, we aggregated and cataloged the referenced datasets, noting their specific characteristics and providing a comprehensive view of data sources. The metrics employed in the various studies were synthesized, revealing prevalent evaluation methodologies. Hardware configurations were also analyzed, uncovering prevailing trends and outlining the technological landscape. Moreover, challenges in YOLO technology were categorized, shedding light on limitations and areas for improvement.
This synthesis process was crucial in distilling key findings and trends from the extensive literature, offering a nuanced understanding of YOLO technology and its real-world applications.

3. Results Based on RQs

In this section, a critical analysis is performed on 115 primary studies based on the previously described RQs and five different aspects: applications, sign datasets, metrics, hardware, and challenges.

3.1. RQ1 [Applications]: What Are the Main Applications of Traffic Sign Detection and Recognition Using YOLO?

The search revealed three main applications of YOLO: road safety, ADAS, and autonomous driving.
Road Safety: Road safety refers to the efforts used to reduce the likelihood of collisions and protect road users. This includes different efforts and legislation aimed at encouraging safe driving practices, improving road infrastructure, monitoring road conditions, identifying road dangers, and improving traffic management and vehicle safety [35,36,37,38,39,40,41,42,43].
For example, YOLO can be implemented in traffic cameras to identify and evaluate congestion, traffic flow, and accidents, which can inform decision-making to improve traffic management and reduce the probability of accidents. Furthermore, YOLO can be integrated with intelligent transportation systems to monitor the movements of pedestrians and cyclists and improve road safety for non-motorized road users.
ADAS: Advanced Driver Assistance Systems (ADAS) are technologies designed to enhance road safety and the driving experience. They utilize a combination of sensors, cameras, and advanced algorithms to aid drivers in various driving tasks [30,44].
ADAS can utilize YOLO to detect and recognize objects in real-time video streams [6]. In this context, YOLO can be applied to detect and recognize traffic signs [7,8,9,10,18,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86].
Autonomous Driving: Autonomous driving is significantly dependent on CV technologies to perceive and evaluate the driving environment. Using cameras and algorithms, CV systems provide autonomous vehicles with relevant information about their surroundings, such as the position and behavior of other cars, pedestrians, and transport networks. These data are used for decision-making, vehicle control, and safe road navigation. To develop autonomous driving systems, a vehicle should generally be equipped with numerous sensors and communication systems [11,12,13,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,135,136].

3.2. RQ2 [Datasets]: What Traffic Sign Datasets Are Used to Train, Validate, and Test These Systems?

Traffic sign datasets are collections of images that contain traffic signs and their annotations. They are used for training, validating, and testing different proposed traffic sign detection and recognition systems.
There are many traffic sign datasets that have been created and used by researchers and practitioners. Some of them are publicly available, while others are private, i.e., restricted to the scientific community. The main datasets that have been used are as follows:
  • The German Traffic Sign Detection Benchmark (GTSDB) and the German Traffic Sign Recognition Benchmark (GTSRB): The GTSDB and GTSRB datasets are two popular resources for traffic sign recognition research. They contain high-quality images of various traffic signs taken in real-world scenarios in Germany. The images cover a wide range of scenery, times of day, and weather conditions, making them suitable for testing the robustness of recognition algorithms. The GTSDB dataset consists of 900 images, split into 600 for training and 300 for validation. The GTSRB dataset is larger, with more than 50,000 images of 43 classes of traffic signs, such as speed limit signs, stop signs, yield signs, and others. Images are also annotated with bounding boxes and class labels. Both datasets are publicly available and have been used in several benchmarking studies [7,13,37,38,40,46,47,50,52,53,55,58,59,60,63,64,65,66,68,69,70,71,72,77,80,82,92,96,101,103,104,106,109,112,114,115,117,118,125,127,129,130,136,137].
  • Tsinghua Tencent 100K (TT100K): The TT100K dataset is a large-scale traffic sign benchmark created by Tsinghua University and Tencent. It consists of 100,000 images from Tencent Street View panoramas, which contain 30,000 traffic sign instances. The images vary in lighting and weather conditions, and each traffic sign is annotated with its class label, bounding box, and pixel mask. The dataset is suitable for traffic sign detection and classification tasks in realistic scenarios. The TT100K dataset is publicly available and can be used for both traffic sign detection and classification tasks [35,38,41,45,51,65,67,80,84,94,95,102,105,108,109,111,113,121,124,128,131,133,134,135].
  • Chinese Traffic Sign Dataset (CTSDB and CCTSDB): The CTSDB and CCTSDB datasets are two large-scale collections of traffic sign images for CV research. The CTSDB dataset consists of 10,000 images captured from different scenes and angles, covering a variety of types and shapes of traffic signs. The CCTSDB dataset is an extension of the CTSDB dataset, with more than 20,000 images that contain approximately 40,000 traffic signs. The CCTSDB dataset also includes more challenging scenarios, such as occlusion, illumination variation, and scale change [8,18,36,39,60,65,68,81,82,83,85,87,108,116,130,132,135,136,138].
  • Belgium Traffic Sign Detection Benchmark and Belgium Traffic Sign Classification Benchmark (BTSDB and BTCDB): The BTSDB dataset, specifically designed for traffic sign detection in Belgium, comprises a total of 7095 images. These images are further divided into 4575 training images and 2520 testing images. The dataset encompasses a diverse range of image sizes, spanning from 11 × 10 pixels to 562 × 438 pixels. The Belgium Traffic Sign Classification Benchmark is a dataset of traffic sign images collected from eight cameras mounted on a van. The dataset contains 62 types of traffic signs and is divided into training and testing sets. The dataset is useful for evaluating traffic sign recognition algorithms, which are essential for intelligent transport systems and autonomous driving. The dataset also provides annotations, background images, and test sequences for further analysis [55,78,101,123].
  • Malaysian Traffic Sign Dataset (MTSD): The MTSD includes a variety of traffic sign scenes to be used in traffic sign detection, having 1000 images with different resolutions (FHD 1920 × 977 pixels; 4K-UHD 3840 × 2160 pixels; UHD+ 4592 × 3448 pixels). It also has 2056 images of traffic signs, divided into five categories, for recognition [11,118].
  • Korea Traffic Sign Dataset (KTSD): This dataset has been used to train and test various deep learning architectures, such as YOLOv3 [57], to detect three different classes of traffic signs: prohibitory, mandatory, and danger. The KTSD contains 3300 images of various traffic signs, captured from different roads throughout South Korea. These images feature traffic signs of varying sizes, providing a diverse and comprehensive dataset for traffic sign detection and recognition research [57,59,64].
  • Berkley Deep Drive (BDD100K): The Berkeley DeepDrive (BDD) project has released a large-scale and diverse driving video dataset called BDD100K. It contains 100,000 videos with rich annotations to evaluate the progress of image recognition algorithms on autonomous driving. The dataset is available for research purposes and can be downloaded from the BDD website (https://bdd-data.berkeley.edu/, accessed on 12 April 2023). The images in the dataset are divided into two sets: one for training and one for validation. The training set contains 70% of the images, while the validation set contains the remaining 30% [55,90].
  • Thai (Thailand) Traffic Sign Dataset (TTSD): The data collection process takes place in the rural areas of Maha Sarakham and Kalasin Provinces within the Kingdom of Thailand. It encompasses 50 distinct classes of traffic signs, each comprising 200 unique instances, resulting in a comprehensive sign dataset that comprises a total of 9357 images [101,126].
  • Swedish Traffic Sign Dataset (STSD): This public sign dataset comprises 20,000 images, with 20% of them labeled. Additionally, it contains 3488 traffic signs from Sweden [104].
  • DFG Traffic Sign Dataset (DFG): The DFG dataset comprises approximately 7000 traffic sign images captured from highways in Slovenia. These images have a resolution of 1920 × 1080 pixels. To facilitate training and evaluation, the dataset is divided into two subsets, with 5254 images designated for training and the remaining 1703 images for validation. The dataset features a total of 13,239 meticulously annotated instances in the form of polygons, each spanning over 30 pixels. Additionally, there are 4359 instances with less precise annotations represented as bounding boxes, measuring less than 30 pixels in width [12].
  • Taiwan Traffic Sign Dataset (TWTSD): The TWTSD dataset comprises 900 prohibitory signs from Taiwan with a resolution of 1920 × 1080 pixels. The training and validation subsets contain 70% and 30% of the images, respectively [75].
  • Taiwan Traffic Sign (TWSintetic): The Taiwan Traffic Sign (TWSintetic) dataset is a collection of traffic signs from Taiwan, consisting of 900 images, and it has been expanded using generative adversarial network techniques [9].
  • Belgium Traffic Signs (KUL): The KUL dataset encompasses over 10,000 images of traffic signs from the Flanders region in Belgium, categorized into more than 100 distinct classes [89].
  • Chinese Traffic Sign Detection Benchmark (CSUST): The CSUST dataset comprises over 15,000 images and is continuously updated to incorporate new data [8].
  • Foggy Road Image Database (FRIDA): The Foggy Road Image Database (FRIDA) contains 90 synthetic images from 18 scenes depicting various urban road settings. In turn, FRIDA2 offers an extended collection, with 330 images derived from 66 road scenes. For each clear image, there are corresponding counterparts featuring four levels of fog and a depth map. The fog variations encompass uniform fog, heterogeneous fog, foggy haze, and heterogeneous foggy haze [71,114].
  • Foggy ROad Sign Images (FROSI): FROSI is a database of synthetic images designed to systematically evaluate the performance of road sign detectors in foggy conditions. It contains 504 original images of 1400 × 600 pixels with 1620 road signs (speed and stop signs, pedestrian crossings) placed at various ranges, with ground truth [71,114].
  • MarcTR: This dataset contains seven traffic sign classes, collected using a ZED stereo camera mounted on top of a Racecar mini autonomous car [79].
  • Turkey Traffic Sign Dataset: The Turkey Traffic Sign Dataset is an essential resource for the development of traffic and road safety technologies, specifically tailored for the Turkish environment. It comprises approximately 2500 images, including a diverse range of traffic signs, pedestrians, cyclists, and vehicles, all captured under real-world conditions in Turkey [77].
  • Vietnamese Traffic Sign Dataset: This comprehensive dataset encompasses 144 classes of traffic signs found in Vietnam, categorized into four distinct groups for ease of analysis and application. These include 40 prohibitory or restrictive signs, 47 warning signs, 10 mandatory signs, and 47 indication signs, providing a detailed overview of the country’s traffic sign system [76].
  • Croatia Traffic Sign Dataset: This dataset consists of 28 video sequences at 30 FPS with a resolution of 720 × 480 pixels. They were taken in the urban traffic of the city of Osijek, Croatia [10].
  • Mexican Traffic Sign Dataset: The dataset consists of 1284 RGB images, featuring a total of 1426 traffic signs categorized into 11 distinct classes. These images capture traffic signs from a variety of perspectives, sizes, and lighting conditions, ensuring a diverse and comprehensive collection. The traffic sign images were sourced from a range of locations including avenues, roadways, parks, green areas, parking lots, and malls in Ciudad Juárez, Chihuahua, and Monterrey, Nuevo Leon, Mexico, providing a broad representation of the region’s signage [43].
  • WHUTCTSD: It is a more recent dataset with five categories of Chinese traffic signs, including prohibitory signs, guide signs, mandatory signs, danger warning signs, and tourist signs. Data were collected by a camera at a 1920 × 1080 pixel resolution during different time periods. It consists of 2700 images, which were extracted from videos collected in Wuhan, Hubei, China [62].
  • Bangladesh Road Sign 2021 (BDRS2021): This dataset consists of 16 classes. Each class consists of 168 images of Bangladesh road signs [69], offering a rich source of data that capture the specific traffic sign environment of Bangladesh, including its urban, rural, and varied geographical landscapes.
  • New Zealand Traffic Sign 3K (NZ-TS3K): This dataset is a specialized collection focused on traffic sign recognition in New Zealand [70]. It features over 3000 images, showcasing a wide array of traffic signs commonly found across the country. These images are captured in high resolution (1080 by 1440 pixels), providing clear and detailed visuals essential for accurate recognition and analysis. The dataset is categorized into multiple classes, each representing a different type of traffic sign. These include Stop (236 samples), Keep Left (536 samples), Road Diverges (505 samples), Road Bump (619 samples), Crosswalk Ahead (636 samples), Give Way at Roundabout (533 samples), and Roundabout Ahead (480 samples), offering a diverse range of signs commonly seen on Auckland’s roads.
  • Mapillary Traffic Sign Dataset (MapiTSD): The Mapillary Traffic Sign Dataset is an expansive and diverse collection of traffic sign images, sourced globally from the Mapillary platform’s extensive street-level imagery. It features millions of images from various countries, each annotated with automatically detected traffic signs. This dataset is characterized by its wide-ranging geographic coverage and diversity in environmental conditions, including different lighting, weather, and sign types. Continuously updated, it provides a valuable up-to-date resource for training and validating traffic sign recognition algorithms [38].
  • Specialized Research Datasets: These datasets consist of traffic sign data compiled by various authors. Generally, they lack detailed public information and are not openly accessible. This category includes datasets from a variety of countries: South Korea [91,103], India [49,99], Malaysia [97], Indonesia [98], Slovenia [54], Argentina [139], Taiwan [74,107,140], Bangladesh [69], and Canada [61]. Each dataset is tailored to its respective country, reflecting the specific traffic signs and road conditions found there.
  • Unknown or General Databases (Unknown): These are datasets for which no reliable information about their traffic sign content is available [22,42,64,73,86,88,93,99,100,110,119,122,141], or general-purpose databases such as MSCOCO [63,73,99] and KITTI [132], or datasets downloaded from repositories such as Kaggle [42].
Table 2 presents the distribution of traffic sign datasets based on their country of origin, database name, number of categories and classes, number of images, and the researchers utilizing them.
A substantial number of articles in this study utilize publicly available databases, known for their convenience in model evaluation and result comparison. MapiTSD, BDD100K, and TT100K are prominent due to their extensive image datasets, while GTSDB and GTSRB are the most cited, comprising 27.33% of the citations, followed by TT100K, with 16.15% of the citations.
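As an illustration of how readily these public benchmarks can be consumed, the sketch below loads GTSRB through torchvision, whose recent releases include a loader for this dataset; the root directory and resize target are illustrative choices, not taken from any reviewed study.

```python
# Minimal sketch: loading the public GTSRB benchmark via torchvision
# (recent releases ship a GTSRB loader). Paths and the resize target
# are illustrative choices.
from torchvision import datasets, transforms

tfm = transforms.Compose([
    transforms.Resize((32, 32)),  # GTSRB images come in varying sizes
    transforms.ToTensor(),
])

train_set = datasets.GTSRB(root="data", split="train", transform=tfm, download=True)
test_set = datasets.GTSRB(root="data", split="test", transform=tfm, download=True)
print(len(train_set), len(test_set))  # number of train/test images
```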

3.3. RQ3 [Metrics]: What Metrics Are Used to Measure the Quality of Object Detection in the Context of Traffic Sign Detection and Recognition Using YOLO?

Various performance metrics are frequently employed to measure the efficacy of traffic sign recognition systems. This section provides a summary of the metrics utilized most frequently in the articles under consideration.
In assessing the computational efficacy and real-time performance of traffic sign detection systems and other object recognition applications, frames per second (FPS) is a commonly employed metric. FPS, detailed in Equation (1), measures the number of video frames per second that a system can process, providing valuable insight into the system’s responsiveness to changing traffic conditions. Achieving a high FPS is essential for deployment in the real world because it ensures timely and accurate responses in dynamic environments such as autonomous vehicles and traffic management systems. Researchers evaluate FPS by executing the system on representative datasets or actual video streams, taking into account factors such as algorithm complexity, hardware configuration, and frame size. FPS, when combined with other metrics such as mAP, precision, and recall, contributes to a comprehensive evaluation of the system’s overall performance, taking into account both accuracy and speed, which are crucial considerations for practical applications.
$$\mathrm{FPS} = \frac{\#\,\text{of frames}}{\text{seconds}} \qquad (1)$$
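As a sketch of how such a measurement is typically taken, one can time repeated inference calls over a set of frames and divide, per Equation (1); here run_inference is a hypothetical stand-in for a YOLO forward pass, and the warm-up count is an illustrative choice.

```python
# Minimal sketch of an FPS measurement, per Equation (1): frames processed
# divided by elapsed seconds. `run_inference` is a hypothetical stand-in
# for a YOLO forward pass on one frame.
import time
from typing import Callable, Sequence

def measure_fps(run_inference: Callable, frames: Sequence, warmup: int = 5) -> float:
    for frame in frames[:warmup]:            # warm-up (GPU init, caching)
        run_inference(frame)
    start = time.perf_counter()
    for frame in frames[warmup:]:
        run_inference(frame)
    elapsed = time.perf_counter() - start
    return (len(frames) - warmup) / elapsed  # frames per second
```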
Accuracy (ACC) is a fundamental and often used metric in the field of traffic sign detection and object recognition systems to evaluate their performance. The accuracy of traffic sign identification, as measured by the ACC metric, and specified in Equation (2), is determined by the proportion of correctly recognized traffic signs, including both true positives (TP) and true negatives (TN), relative to the total number of traffic signs. This core statistic offers vital insights into the overall efficacy of the system in accurately categorizing items of interest. The accuracy of predictions, as measured by the ACC metric, serves as a dependable indicator of the system’s capacity to differentiate between various kinds of traffic signs. ACC is widely used by researchers and practitioners to fully assess and compare different detection algorithms and models, thereby facilitating the progress and enhancement of technology related to traffic sign recognition.
$$\mathrm{ACC} = \frac{TP + TN}{TP + FP + TN + FN} \qquad (2)$$
Precision and recall are crucial metrics used to evaluate the performance of algorithms in the context of traffic sign detection and recognition. Precision quantifies the proportion of correctly identified positive instances relative to the total number of instances identified as positive. It serves as an indicator of the accuracy of positive predictions and aims to minimize the occurrence of false positives. On the other hand, recall, sometimes referred to as sensitivity or true positive rate, measures the proportion of true positives relative to the total number of positive instances, thereby reflecting the model's capacity to identify all pertinent instances and reduce the occurrence of false negatives. These complementary measures have significant value in situations where both accuracy and completeness are paramount. They allow researchers to strike a balance between the system's capacity to produce precise predictions and its ability to minimize missed detections. By achieving this equilibrium, precision and recall offer a thorough assessment of the overall efficacy of the system in the realm of traffic sign recognition, thus supporting the development and implementation of dependable recognition systems. The mathematical expressions for precision and recall are detailed in Equations (3) and (4), respectively.
$$\mathrm{Precision} = \frac{TP}{TP + FP} \qquad (3)$$
$$\mathrm{Recall} = \frac{TP}{TP + FN} \qquad (4)$$
The F1 score is a crucial statistic in the field of traffic sign detection and identification since it condenses the balance between precision and recall into a single number. The F1 score is calculated as the harmonic mean of the two measures, providing a comprehensive evaluation of algorithm performance. It takes into account the accuracy of positive predictions as well as the system's capability to correctly identify all relevant cases. This statistic is particularly valuable in scenarios characterized by unbalanced datasets or disparate class distributions. The F1 score enables researchers and practitioners to thoroughly assess the efficacy of the model, facilitating the advancement of resilient and flexible traffic sign recognition systems that achieve an ideal balance between precision and recall. The formula representing the F1 score is given in Equation (5):
$$F_1 = 2 \times \frac{\mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \qquad (5)$$
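A minimal sketch computing Equations (2) through (5) from raw confusion-matrix counts is given below; the example counts are illustrative, not drawn from any reviewed study.

```python
# Minimal sketch of Equations (2)-(5) from confusion-matrix counts.
# The example counts below are illustrative only.
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    acc = (tp + tn) / (tp + fp + tn + fn)              # Equation (2)
    precision = tp / (tp + fp) if tp + fp else 0.0     # Equation (3)
    recall = tp / (tp + fn) if tp + fn else 0.0        # Equation (4)
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)              # Equation (5)
    return {"ACC": acc, "Precision": precision, "Recall": recall, "F1": f1}

print(classification_metrics(tp=90, fp=10, tn=85, fn=15))
# {'ACC': 0.875, 'Precision': 0.9, 'Recall': 0.857..., 'F1': 0.878...}
```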
Mean Average Precision (mAP) is the metric par excellence for evaluating the precision of an object detector. In the context of object detection tasks, mAP is widely recognized as a comprehensive and reliable performance metric. It takes precision and recall at multiple confidence thresholds into account and provides an informative evaluation of the detector's capabilities. By considering varying recall levels, mAP provides insight into how well the detector performs at various sensitivity levels. The mAP is determined by Equation (6).
$$\mathrm{mAP} = \frac{1}{K} \sum_{k=1}^{K} \mathrm{AP}_k \qquad (6)$$
Intersection over Union (IoU) is another crucial metric utilized in the field of traffic sign detection and recognition. It measures the degree of spatial overlap between the predicted bounding boxes and the ground truth bounding boxes of detected objects. The IoU, detailed in Equation (7), quantifies the precision of traffic sign localization by calculating the ratio between the area of intersection and the area of union of the boxes. That is, the IoU metric is central to assessing how accurately object boundaries are predicted and how well an algorithm localizes objects. This measure holds particular significance in situations where exact object localization is of utmost importance, hence contributing to the progress of reliable traffic sign detection and recognition systems with correct spatial comprehension.
$$\mathrm{IoU} = \frac{|B_{gt} \cap B_{pt}|}{|B_{gt} \cup B_{pt}|} \qquad (7)$$
where $B_{gt}$ represents the ground truth bounding box and $B_{pt}$ the predicted bounding box.
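As an illustrative worked example, consider a predicted box spanning (0, 0) to (10, 10) and a ground truth box spanning (5, 5) to (15, 15): the intersection area is 5 × 5 = 25, the union area is 100 + 100 − 25 = 175, and hence IoU = 25/175 ≈ 0.14, a match that would be rejected under the commonly used 0.5 acceptance threshold.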
Finally, the Average Precision (AP) measure bears great importance in the field of traffic sign detection and recognition. It provides a thorough assessment of algorithm performance by considering different confidence thresholds. The AP is determined by computing the average of precision values at various recall levels. This metric effectively represents the balance between making precise positive predictions and including all positive cases. It offers a nuanced viewpoint on the capabilities of a system, particularly in situations where datasets are unbalanced or object class distributions fluctuate. The utilization of AP facilitates the thorough evaluation of the system's efficiency in detecting and recognizing traffic signs. This, in turn, fosters the advancement of traffic sign recognition systems that are both resilient and adaptable, capable of performing exceptionally well under various operational circumstances. AP is given by Equation (8):
$$\mathrm{AP} = \frac{1}{n_{\mathrm{pos}}} \sum_{i=1}^{n} \big(\mathrm{Recall}(i) - \mathrm{Recall}(i-1)\big) \cdot \mathrm{Precision}(i) \qquad (8)$$
where $n_{\mathrm{pos}}$ is the total number of positive instances (ground truth objects), $n$ is the total number of thresholds at which precision and recall values are computed, $\mathrm{Recall}(i)$ represents the recall value at threshold $i$, $\mathrm{Precision}(i)$ denotes the precision value at threshold $i$, and $\mathrm{Recall}(0) = 0$ and $\mathrm{Precision}(0)$ can be set to 1 for convenience.
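A minimal sketch of this computation is shown below, combining the recall-weighted sum of Equation (8) with the per-class averaging of Equation (6); the precision-recall points are illustrative placeholders, and the $1/n_{\mathrm{pos}}$ prefactor is left implicit because the recall values used here are already normalized by the number of ground truth positives.

```python
# Minimal sketch of AP as a recall-weighted sum of precision values, and
# mAP (Equation (6)) as the mean AP over classes. The curves below are
# illustrative placeholders. The 1/n_pos prefactor of Equation (8) is
# implicit here, since recall is already normalized by the number of
# ground truth positives.
from typing import List, Sequence, Tuple

def average_precision(recalls: Sequence[float], precisions: Sequence[float]) -> float:
    ap, prev_r = 0.0, 0.0
    for r, p in zip(recalls, precisions):  # points ordered by increasing recall
        ap += (r - prev_r) * p             # (Recall(i) - Recall(i-1)) * Precision(i)
        prev_r = r
    return ap

def mean_average_precision(curves: List[Tuple[Sequence[float], Sequence[float]]]) -> float:
    aps = [average_precision(r, p) for r, p in curves]
    return sum(aps) / len(aps)             # Equation (6): mean over K classes

curves = [([0.5, 1.0], [1.0, 0.8]),             # class 1: AP = 0.9
          ([0.25, 0.5, 1.0], [1.0, 0.9, 0.6])]  # class 2: AP = 0.775
print(mean_average_precision(curves))           # 0.8375
```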
These metrics play a critical role in objectively evaluating the performance of traffic sign recognition systems and facilitating comparisons between different approaches. However, the choice among metrics may vary based on specific applications and requirements, and researchers should carefully select the most relevant metrics to assess the system’s effectiveness in real-world scenarios.

3.4. Comparing Metrics among Different Versions of YOLO

A comprehensive analysis was conducted to compare various metrics by reviewing selected research from 2016 to 2022 and computing the average values. This section introduces a series of scatter plots that assess the efficacy of traffic sign detection and recognition systems utilizing different versions of YOLO.
Figure 2a presents a scatter plot of FPS versus mAP, illustrating that YOLO’s FPS ranges from 25 to 85. This range indicates its suitability as a real-time object detector. Meanwhile, the mAP varies between 73% and 89%, with YOLOv5 emerging as the most efficient version. This suggests that, while YOLO is a competitive detector, there is still room for improvement.
A more detailed examination in Figure 2b reveals the relationship between ACC and mAP. Here, YOLOv5 demonstrates the lowest ACC but the highest mAP. Conversely, YOLOv2 shows the highest ACC with a comparatively lower mAP, partly due to the limited number of studies reporting ACC data for this version.
In another perspective, Figure 3a–c show the box plots of the mAP, ACC, and FPS metrics for the different versions of YOLO. YOLOv5 consistently outperforms in the mAP metric, while YOLOv2 lags behind. In terms of ACC, YOLOv1 exhibits the highest overall score, closely followed by YOLOv5. For the FPS metric, YOLOv5 again shows superior performance.
Figure 4a shows the average value of the mAP, ACC, precision, recall, and F1 score metrics, grouped across all YOLO versions. In all cases, the results are high, above 79%. The best precision and recall values correspond to YOLOv5, and the best F1 score value is from YOLOv3.
Then, Figure 4b shows in detail, for each of the YOLO versions, all the metrics. YOLOv4 presents the best ACC value, followed by YOLOv2. In the case of the mAP metric, YOLOv2 has the best performance. However, this is an isolated result because there is only one article in this case.

3.5. RQ4 [Hardware]: What Hardware Is Used to Implement Traffic Sign Recognition and Detection Systems Based on YOLO?

The implementation of traffic sign detection and recognition systems using the YOLO framework often necessitates hardware configurations that achieve a balance between computing capacity and efficiency. The selection and arrangement of hardware components may differ based on the specific application and its corresponding requirements. Graphics Processing Units (GPUs) play a crucial role in expediting the inference procedure of deep learning models such as YOLO. High-performance GPUs manufactured by prominent corporations such as Nvidia are frequently employed for the purpose of real-time or near-real-time inference. In certain instances, the precise GPUs employed may not be expressly stated or readily discernible based on the accessible information. This underscores the importance of thorough and precise reporting in both research and practical applications. The inclusion of detailed hardware specifications is essential for the replication of results and the effective implementation of these systems in diverse contexts.
Figure 5a offers a visual depiction of the prevalence of specific Nvidia GPU graphic cards within a range of research articles. Each bar within the plot corresponds to distinct Nvidia GPU models, and the height of each bar signifies the frequency with which each model is employed in these articles. These data provide valuable insights for individuals seeking to understand the prevalent usage of Nvidia GPUs in the realm of traffic sign detection and recognition.
Embedded cards like Jetson, Raspberry Pi, and Xilinx Zynq have emerged as crucial solutions in the field of traffic sign detection and recognition technologies. Embedded systems, which fall under the category of electronic devices, exhibit a remarkable equilibrium between energy efficiency and processing capabilities, rendering them indispensable elements in the realm of road safety and intelligent transportation applications. The compact nature and high computational capabilities of these devices allow for the implementation of sophisticated CV algorithms in real time. This enables accurate identification and classification of traffic lights, signage, and road conditions. Moreover, the inherent adaptability and integrated connection of these devices render them highly suitable for edge computing and industrial automation purposes, thereby playing a significant role in the progress of road safety and traffic optimization in urban as well as rural environments.
The visual representation in Figure 5b provides a graphical depiction of the prevalence of embedded GPU cards within the context of our study. Each bar in the graphical representation corresponds to a unique model of mobile/embedded GPU card, and its height is directly proportional to the frequency with which that specific GPU card model is mentioned in the papers analyzed in this study.
Jetson Xavier NX is identified as the most often utilized embedded system, as evidenced by several scholarly sources [63,68,77,135]. Following closely behind are Jetson Nano, which is also referenced in several academic articles [63,77,129], Jetson TX2 [41,79,140], and Jetson Xavier AGX, which is mentioned in at least one academic source [77]. These embedded systems are supported by Nvidia. Moreover, the utilization of Raspberry Pi [37,46,98] and Xilinx ZYNQ [18] is also observed within the scope of this investigation.
The fundamental differentiation between conventional GPU graphic cards and embedded systems resides in their physical structure, operational capabilities, and appropriateness for particular use cases. Conventional GPUs, which are primarily intended for producing high-performance graphics, possess considerable computational capabilities. However, they exhibit significant drawbacks, such as their bulky size, high power consumption, and limited adaptability in power-constrained or space-restricted settings. On the other hand, embedded systems, which are characterized by their compactness and energy efficiency, are specifically designed to excel in edge computing applications, such as the detection of traffic signs. These systems prioritize real-time processing capabilities and low power consumption as critical factors for optimal performance. Integrated sensors and numerous communication possibilities are frequently included in these devices, hence augmenting their use for many applications, such as intelligent transportation systems and road safety.
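For the embedded boards discussed above, a lightweight deployment route that avoids a full deep learning framework is OpenCV's DNN module, which can load Darknet-format YOLO networks directly. The sketch below is illustrative only: the .cfg and .weights file names are placeholders, and the backend/target settings would be adjusted per board (e.g., a CUDA target on Jetson devices with a CUDA-enabled OpenCV build).

```python
# Hypothetical lightweight deployment sketch for an embedded board using
# OpenCV's DNN module. The yolov4-tiny.cfg/.weights file names are
# placeholders; they are not files released by any cited study.
import cv2
import numpy as np

net = cv2.dnn.readNetFromDarknet("yolov4-tiny.cfg", "yolov4-tiny.weights")
# Default CPU execution; on a Jetson with a CUDA-enabled OpenCV build,
# the backend and target could be switched to CUDA instead.
net.setPreferableBackend(cv2.dnn.DNN_BACKEND_OPENCV)
net.setPreferableTarget(cv2.dnn.DNN_TARGET_CPU)

frame = (np.random.rand(416, 416, 3) * 255).astype(np.uint8)  # stand-in frame
blob = cv2.dnn.blobFromImage(frame, scalefactor=1 / 255.0,
                             size=(416, 416), swapRB=True)
net.setInput(blob)
outputs = net.forward(net.getUnconnectedOutLayersNames())
print([o.shape for o in outputs])  # raw YOLO detection tensors per output layer
```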

3.6. RQ5 [Challenges]: What Are the Challenges Encountered in the Detection and Recognition of Traffic Signs Using YOLO?

The examination carried out in the domain of traffic sign detection and recognition has revealed seven noteworthy challenges. These include variations in lighting conditions, unfavorable weather conditions, partial occlusion, damage to signs, intricate scenarios, insufficient image quality, and concerns pertaining to geographical regions. In order to efficiently address and classify these challenges, they have been assigned the designations CH1 through CH7.
[CH1] Fluctuation in Lighting Conditions. The fluctuation in lighting conditions within road environments poses a significant obstacle to the accurate detection and recognition of traffic signs. The issue at hand is a result of the fluctuating illumination levels experienced on roadways, which have a direct influence on the comprehensibility of traffic signs. In conditions of reduced illumination, signage may exhibit diminished visibility, hence augmenting the potential for accidents. Conversely, under conditions of extreme illumination, such as exposure to direct sunlight or the presence of headlights emitted by other vehicles, the phenomenon of glare may manifest, impeding the observer’s ability to accurately discern and interpret signage. The presence of lighting variability can give rise to unforeseeable shadows and contrasts on signage, obstructing crucial details and adding complexity to their identification. Automatic sign identification systems that rely on CV may encounter challenges when operating in environments with varying illumination conditions, particularly in the context of autonomous driving. In order to tackle these issues, researchers have created sophisticated technologies, including image enhancement algorithms, night vision systems, and adaptive image processing approaches. The field of sign technology has also witnessed notable advancements, including the introduction of retroreflective signs that enhance their detectability in conditions of reduced illumination. These solutions aim to alleviate the effects of lighting unpredictability on the safety of road users.
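One widely used example of the image enhancement algorithms mentioned above is contrast-limited adaptive histogram equalization (CLAHE). The sketch below shows a typical low-light preprocessing step applied before detection; the parameter values are illustrative assumptions, and this is not the method of any particular reviewed study.

```python
# Illustrative low-light preprocessing with CLAHE (contrast-limited
# adaptive histogram equalization) via OpenCV; parameter values assumed.
import cv2

def enhance_low_light(bgr_frame):
    # Equalize only the luminance channel so sign colors are preserved.
    lab = cv2.cvtColor(bgr_frame, cv2.COLOR_BGR2LAB)
    l, a, b = cv2.split(lab)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    return cv2.cvtColor(cv2.merge((clahe.apply(l), a, b)), cv2.COLOR_LAB2BGR)

# frame = cv2.imread("night_scene.jpg")   # placeholder path
# detector_input = enhance_low_light(frame)
```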
[CH2] Adverse Weather Conditions. One of the significant challenges in the field of traffic sign recognition and detection pertains to mitigating the effects of adverse weather conditions. These conditions include heavy rainfall, dense fog, snowfall, and dust storms, among several other weather phenomena. These atmospheric occurrences reduce visibility, which consequently degrades the quality of images acquired by the sensors incorporated in these systems. As a result, the effectiveness of detecting and recognizing traffic signs is degraded, which has significant ramifications for road safety and the operational efficiency of autonomous driving systems. Therefore, it is crucial to address this difficulty and develop robust procedures that can operate well in all weather conditions.
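A common mitigation is to augment training images with synthetic weather degradations so the detector learns from inputs resembling the conditions described above. The sketch below implements a deliberately simple fog model by blending each frame toward a uniform haze color; the blending model and its parameters are illustrative assumptions, not a technique taken from any cited paper.

```python
# Illustrative synthetic-fog augmentation: blend each training image
# toward a uniform haze color. The blending model is a deliberate
# simplification, not a technique taken from any cited paper.
import numpy as np

def add_synthetic_fog(rgb, fog_strength=0.5):
    """fog_strength in [0, 1]; 0 leaves the image unchanged."""
    atmosphere = np.full_like(rgb, 220)  # near-white haze color
    fogged = ((1.0 - fog_strength) * rgb.astype(np.float32)
              + fog_strength * atmosphere.astype(np.float32))
    return fogged.clip(0, 255).astype(np.uint8)

# img = ...  # H x W x 3 uint8 training image
# foggy = add_synthetic_fog(img, fog_strength=np.random.uniform(0.2, 0.7))
```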
[CH3] Partial Occlusion. The issue of partial occlusion in the field of traffic sign recognition and detection pertains to the circumstance in which a notable portion of a traffic sign becomes occluded or concealed by other items within the scene. The presence of parked vehicles, trees, buildings, or other features in the road environment can contribute to this phenomenon. The presence of partial occlusion poses a significant obstacle for CV systems that are responsible for accurately detecting and categorizing traffic signs. This is due to the fact that the available visual data may be inadequate or corrupted. The resolution of this issue necessitates the creation of algorithms and methodologies with the ability to identify and categorize traffic signs, even in situations where they are partially obscured. This is of utmost importance in enhancing both road safety and the efficacy of driver assistance systems.
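One standard way to build such tolerance is occlusion-style data augmentation, for example random erasing, which blanks out a random rectangle during training so the network must classify signs from partial evidence. The sketch below uses torchvision’s RandomErasing transform; the parameter values are illustrative.

```python
# Illustrative occlusion augmentation with torchvision's RandomErasing:
# a random rectangle is blanked out during training so the network learns
# to classify signs from partial evidence. Parameter values are assumed.
from torchvision import transforms

augment = transforms.Compose([
    transforms.ToTensor(),                       # PIL image -> C x H x W tensor
    transforms.RandomErasing(p=0.5,              # apply to half the samples
                             scale=(0.02, 0.2),  # fraction of area erased
                             ratio=(0.3, 3.3),   # aspect ratio of the patch
                             value=0),           # fill erased region with black
])

# from PIL import Image
# sign_crop = Image.open("sign_crop.png")  # placeholder path
# occluded_tensor = augment(sign_crop)
```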
[CH4] Sign Damage. The issue of sign damage within the domain of traffic sign recognition and detection pertains to the physical degradation or deterioration of road signs. This degradation can impede the accurate identification and categorization of signs by automated systems. It can be attributed to a range of circumstances, encompassing severe weather conditions, acts of vandalism, vehicular crashes, and other manifestations of natural deterioration. The existence of signs that are damaged or illegible is a notable obstacle for algorithms designed to recognize traffic signs, as their capacity to comprehend and analyze the information conveyed by the sign is undermined. Hence, it is imperative to address the issue of sign degradation in order to guarantee the efficiency and dependability of driving assistance and road safety systems that rely on automatic traffic sign identification.
[CH5] Complex Scenarios. The examination of traffic sign recognition and detection in intricate settings poses considerable difficulties owing to the existence of numerous signs with backgrounds that are challenging to distinguish, as well as instances where multiple signs may appear as a unified entity. The intricate nature of these circumstances presents a significant challenge for CV systems designed to automatically read traffic signs in dynamic and diverse settings. The presence of many traffic signs within the visual range of a detection system poses a significant obstacle. In scenarios characterized by heavy traffic or intricate crossings, it is common for multiple signs to overlap or be situated in close proximity to one another. This overlap has the potential to confuse the detection system, leading to the misidentification of signs or the inadvertent omission of certain signs. Moreover, intricate backgrounds, such as densely vegetated areas or the existence of items within the surrounding environment, might produce visual interference that poses challenges in effectively distinguishing the sign from the background. Another significant problem pertains to the system’s capacity to differentiate between various signs that could initially appear as a unified entity. This phenomenon may occur as a result of physical overlap or the simultaneous presence of signs exhibiting identical form and color patterns. An erroneous merging of signs can lead to significant ramifications in relation to both the safety of road users and the overall efficiency of traffic flow.
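Overlapping candidate detections of the kind described above are conventionally resolved with non-maximum suppression (NMS), which YOLO pipelines apply as a post-processing step; the IoU threshold controls how aggressively nearby boxes are merged, and tuning it poorly can cause exactly the missed or merged signs discussed here. A minimal sketch follows, with illustrative box coordinates.

```python
# Minimal non-maximum suppression (NMS) sketch: keep the highest-scoring
# box and discard overlapping boxes whose IoU exceeds a threshold.
def iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, iou_threshold=0.5):
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_threshold]
    return keep

# Two heavily overlapping candidates around one sign, plus a distinct sign.
boxes = [(10, 10, 60, 60), (12, 12, 62, 62), (100, 100, 150, 150)]
scores = [0.9, 0.75, 0.8]
print(nms(boxes, scores))  # -> [0, 2]: the duplicate box is suppressed
```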
[CH6] Inadequate Image Quality. The issue pertaining to inadequate image quality poses a substantial constraint within the domain of traffic sign recognition and detection. This difficulty pertains to the existence of images that exhibit insufficient resolution, noise, or visual deterioration, hence impeding the accurate identification and categorization of traffic signs. The presence of substandard images can be attributed to a range of circumstances, including unfavorable lighting conditions, obstructions, significant distances between the camera and the subject, or inherent constraints of the employed capture technologies. Such adverse conditions have the potential to undermine the accuracy and dependability of recognition algorithms. This highlights the significance of developing resilient strategies that can effectively handle such images within the domain of traffic sign recognition and detection.
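Simple restoration steps are often placed in front of the detector when frames are noisy or blurred. The sketch below chains non-local-means denoising with a mild unsharp mask using OpenCV; it is an illustrative pipeline under assumed parameter values, not a method drawn from any reviewed study.

```python
# Illustrative pre-detection restoration for noisy frames: non-local-means
# denoising followed by a mild unsharp mask. Parameter values are assumed.
import cv2

def restore(bgr_frame):
    denoised = cv2.fastNlMeansDenoisingColored(
        bgr_frame, None, h=7, hColor=7,
        templateWindowSize=7, searchWindowSize=21)
    blurred = cv2.GaussianBlur(denoised, (0, 0), sigmaX=2.0)
    # Unsharp mask: re-emphasize edges softened by the denoiser.
    return cv2.addWeighted(denoised, 1.5, blurred, -0.5, 0)

# frame = cv2.imread("noisy_frame.jpg")  # placeholder path
# detector_input = restore(frame)
```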
[CH7] Geographic Regions. The study of traffic sign recognition and detection is influenced by various complexities that are inherent to different geographic regions. The intricacies arise from the variety of approaches to road infrastructure design and development, as well as the cultural disparities deeply rooted in various communities.
One major challenge is the presence of disparities in the placement and design of traffic signs. Various geographic regions may adopt distinct strategies when it comes to the placement, dimensions, and color schemes employed in signs. The extent of this heterogeneity might span from minor variations in typographic elements to significant alterations in the emblematic icons. Hence, it is crucial that recognition systems possess the necessary flexibility to handle such variability and effectively read a wide range of indicators.
The existence of divergent traffic laws and regulations across different locations presents significant issues. There can be variations in traffic regulations and signaling rules, which can have a direct impact on the interpretation and enforcement of traffic signs. Hence, it is imperative for recognition systems to possess the capability of integrating a wide range of regulations and to be developed in a manner that can adapt to requirements specific to different regions.
Table 3 presents a full summary of each challenge, along with the related articles that address them. After conducting a comprehensive analysis, a range of solutions employed by diverse authors has been identified. The solutions can be categorized into three primary groups: SOL1 studies, which actively propose solutions to the identified problems; SOL2 studies, which acknowledge the issues but do not provide specific remedies; and SOL3 studies, which provide information about datasets without directly addressing the challenges at hand.
From Table 3, it can be seen that challenges CH1, CH2, and CH3 are the most frequently addressed with direct and specific solutions. On the contrary, challenges CH4 to CH7 are not directly addressed by any researcher.

3.7. Discussion

This study presents an SLR on the utilization of YOLO technology for the purpose of object detection within the domain of traffic sign detection and recognition. This review offers a detailed examination of prospective applications, datasets, essential metrics, hardware considerations, and challenges associated with this technology. This section provides an analysis of the primary discoveries and implications of the SLR.
The measurement of FPS holds significant importance, particularly in dynamic settings where prompt detection plays a vital role in alerting drivers to potential hazards or infractions. It is imperative to strike a balance between accuracy and speed in real-time applications.
The accessibility and caliber of datasets play a crucial role in the training and assessment of object detection algorithms. The review elucidates the difficulties encountered in the process of data normalization, the discrepancies in data quality across different authors, and the insufficient amount of information available for certain datasets. The Chinese TT100K dataset is notable for its comprehensive representation of traffic signs, categories, and authentic driving scenarios. The USA BDD100K dataset provides a comprehensive compilation of traffic signs, encompassing a diverse array of lighting conditions. Nevertheless, the analysis of these datasets necessitates substantial computational resources.
From a pragmatic perspective, the German databases, although moderately sized, are extensively employed. These datasets provide a diverse range of traffic signs and categories, hence enhancing their suitability for experimental purposes.
In the realm of hardware considerations, the fundamental importance lies in the utilization of efficient algorithms that exhibit lower energy consumption and robustness in contexts with limited memory capacity. It is advisable to minimize redundancy in the number of layers within neural networks. The effective dissipation of heat is of utmost importance in mitigating the risk of overheating, particularly in the context of mobile GPU systems. It is worth mentioning that Nvidia presently holds a dominant position in the market for mobile GPU systems.
Examining the challenges, this review emphasizes the necessity of conducting research that focuses on sign degradation, which is a crucial but frequently disregarded element in the field of traffic sign detection and recognition. The failure to consider the presence of damaged signs may result in the generation of incorrect detections and the probable occurrence of hazardous circumstances on the roadway. Tackling this difficulty not only improves the resilience of algorithms but also contributes to the broader goal of developing more reliable traffic sign recognition systems.
Another notable difficulty that has been emphasized is the lack of research studies that specifically examine the geographical disparities in the detection and recognition of traffic signs. Geographical variations can introduce unique complexities, including diverse traffic sign designs, language differences, and local regulations. Failing to account for these nuances can severely limit the effectiveness and applicability of traffic sign detection systems across diverse regions. This observation underscores the pressing need for targeted research efforts in this area. By addressing this challenge, we not only enhance the robustness and adaptability of our detection systems but also contribute to safer and more reliable transportation networks on a global scale.
Lastly, the SLR emphasizes the broad applications of YOLO technology in traffic-related contexts. It serves as a pivotal tool in preventing traffic accidents by providing early warnings of potential issues during driving. Additionally, it can be integrated with GPS systems for infrastructure maintenance, allowing for real-time assessment of road sign quality. Moreover, YOLO technology plays a foundational role in the development of autonomous vehicles.

3.8. Possible Threats to SLR Validation

This section discusses the threats that may affect this SLR and how they have been addressed.

3.9. Construct Validity Threat

During our comprehensive investigation of research papers centered on YOLO for the detection and recognition of traffic signs, we recognized a number of possible challenges to the soundness of our findings, mostly emanating from the presumptions made throughout the evaluation procedure. In order to ensure the trustworthiness and appropriate interpretation of the presented findings, it is crucial to acknowledge and tackle these constraints.

3.9.1. The Influence of Version Bias on Metrics

Threat: The potential bias in the metrics arises from the inclusion of different versions of YOLO in the collected research, with the assumption that newer versions of YOLO are intrinsically more efficient.
Mitigation: This review acknowledges the variations in YOLO versions but assumes that the metrics presented in various studies may be compared, regardless of the specific YOLO version used. Nevertheless, it is imperative to recognize the potential impact of version differences on the documented performance measures.

3.9.2. Geographic Diversity in Sign Datasets

Threat: The inclusion of traffic sign datasets from various geographical locations may generate unpredictability in the reported performance of the YOLO algorithm, thereby affecting the applicability of the findings.
Mitigation: Despite acknowledging the constraint of dataset origin diversity, we assume the comparability of performance indicators across various studies. Nevertheless, it is important to acknowledge the potential influence of geographical variations on the performance of YOLO, hence underscoring the necessity for careful interpretation.

3.9.3. Hardware Discrepancies

Threat: The presence of disparities in hardware utilization among studies may result in variations in the time performance of the detection and recognition systems. This assumption is based on the idea that newer versions of YOLO require greater hardware resources and that more advanced hardware leads to quicker response times.
Mitigation: The underlying premise of our study is that the comparability of results holds true across experiments conducted with varying hardware setups. However, it is important to identify and take into consideration the potential impact of hardware changes on the reported time performance in this assessment.

3.9.4. Limitations of Statistical Metrics

Threat: Relying on a single statistical summary metric to measure YOLO’s capability for traffic sign detection may not fully capture the system’s performance.
Mitigation: Despite this limitation, our main premise is that the selected metrics enable the comparability of various YOLO systems and adequately assess their overall effectiveness. Acknowledging the inherent limitations of statistical summaries is imperative when attempting to conduct a thorough assessment of detection and identification systems.
Our goal is to improve the clarity and reliability of our review’s interpretation by openly acknowledging these potential threats to construct validity. This provides a more nuanced understanding of the limitations associated with the diverse aspects of YOLO techniques for traffic sign detection and recognition.

3.9.5. Threats to Internal Validation

The greatest threat that this SLR has faced is the number of studies collected. A total of 115 articles have been analyzed, selected from the period 2016 to 2022 and falling into the category of “Journal” or “Conference”. Additionally, articles published or about to be published in 2023 have not yet been analyzed.

3.9.6. Threats to External Validation

In this study, only the data contained in the previously selected articles have been collected and analyzed; that is, no experimentation has been carried out with traffic sign detection and recognition systems, so our own experimental data could not be compared with those provided by the different researchers.

3.9.7. Threats to the Validation of the Conclusions

The documents studied have been compiled from nine world-renowned publishers, but valuable information was only available from seven of them. Likewise, works published by other publishers (ACM), in pre-prints (arXiv), in patent bases (EPO, Google Patents), or by technology companies (VisLab, Continental) have been omitted. Consequently, no literature has been collected from these sources. Ignoring this literature suggests that traffic sign detection and recognition systems may be more advanced than presented in this study. Therefore, conclusions based on 115 studies may not adequately represent the progress of traffic sign detection, recognition, and identification systems using YOLO in the period from 2016 to 2022.

3.10. Future Research Directions

This study suggests that, despite the potential benefits of traffic sign detection and recognition systems as a technology with the potential to reduce traffic accidents and contribute to the development of autonomous vehicles, specific barriers such as extreme weather conditions, lack of standardized databases, and insufficient criteria for choosing the object detector are the main problems that need to be addressed. Thus, the following questions arise:
  • What is the performance of traffic sign detection and recognition systems under extreme weather conditions?
  • How could datasets for the development of traffic sign detection and recognition systems be standardized?
  • What is the best object detector to be used in traffic sign detection and recognition systems?
Future research will be initiated based on the answers to these questions and will include new versions of YOLO, e.g., YOLOv6, YOLOv7, YOLOv8, and YOLO-NAS.

4. Conclusions

This study presents a comprehensive and up-to-date analysis of the utilization of the YOLO object detection algorithm in the field of traffic sign detection and recognition for the period from 2016 to 2022. The review also offers an analysis of applications, sign datasets, effectiveness assessment measures, hardware technologies, challenges faced in this field, and their potential solutions.
The results emphasize the extensive use of the YOLO object detection algorithm as a popular method for detecting and recognizing traffic signs, particularly in fields such as vehicular security and intelligent and autonomous vehicles.
The accessibility and usefulness of the YOLO object detection algorithm are underscored by the existence of publicly accessible datasets that may be used for training, testing, and validating YOLO-based systems.
In addition, the analysis is enhanced by considering the prevailing hardware options, specifically the widespread utilization of the Nvidia RTX 2080 and Tesla V100 for data processing, and the Jetson Xavier NX for embedded applications, which adds a practical aspect to the evaluation.
However, a notable deficiency in the existing body of research pertains to the examination of geographical discrepancies and environmental factors, along with the difficulties presented by deteriorated, obscured, blurred, or unclear traffic signs.
The acquisition of this knowledge bears substantial importance for researchers and practitioners undertaking projects that involve the implementation of the YOLO object detection algorithm in the fields of traffic sign detection and recognition.

Author Contributions

M.F.-C.: conceptualization, data curation, formal analysis, funding acquisition, investigation, methodology, validation, visualization, writing—original draft, project administration, supervision, and writing—review and editing; D.G., J.M., B.S.L., B.D. and J.S.A.: data curation and formal analysis; C.A.A.: methodology, supervision, and writing—review and editing; D.Z.-B.: supervision and writing—review; J.M.A.M.: supervision and funding acquisition. All authors gave final approval for publication and agreed to be held accountable for the work performed herein. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding, and the APC was funded by grants PID2019-104793RB-C31, PdC2022-133684-C31, and PDC2021-1215-17-C31, funded by MCIN/AEI/10.13039/501100011033.

Data Availability Statement

Data are contained within the article.

Acknowledgments

The authors wish to offer thanks to J. Yunda, from UCLouvain, and M. Pilla, from CSIC, for their help in compiling the information, and to “the open source community” for all the contributions.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
WHO    World Health Organization
RTA    Road Traffic Accidents
ADAS   Advanced Driver Assistance Systems
DL     Deep Learning
CV     Computer Vision
YOLO   You Only Look Once
GPU    Graphics Processing Unit
SLR    Systematic Literature Review
RQ     Research Question
FPS    Frames per Second
ACC    Accuracy
AP     Average Precision
mAP    Mean Average Precision
F1     F1 Score
IoU    Intersection over Union
WoS    Web of Science

References

  1. Yu, G.; Wong, P.K.; Zhao, J.; Mei, X.; Lin, C.; Xie, Z. Design of an Acceleration Redistribution Cooperative Strategy for Collision Avoidance System Based on Dynamic Weighted Multi-Objective Model Predictive Controller. IEEE Trans. Intell. Transp. Syst. 2022, 23, 5006–5018. [Google Scholar] [CrossRef]
  2. World Health Organization (WHO). Global Status Report on Road Safety 2018. Available online: https://www.who.int/publications/i/item/WHO-NMH-NVI-18.20/ (accessed on 1 September 2022).
  3. Statista. Road Accidents in the United States-Statistics & Facts. Available online: https://www.statista.com/topics/3708/road-accidents-in-the-us/#dossierKeyfigures (accessed on 2 September 2022).
  4. European Parliament. Road Fatality Statistics in the EU. Available online: https://www.europarl.europa.eu/news/en/headlines/society/20190410STO36615/road-fatality-statistics-in-the-eu-infographic (accessed on 2 September 2022).
  5. Liu, C.; Li, S.; Chang, F.; Wang, Y. Machine Vision Based Traffic Sign Detection Methods: Review, Analyses and Perspectives. IEEE Access 2019, 7, 86578–86596. [Google Scholar] [CrossRef]
  6. Jiang, P.; Ergu, D.; Liu, F.; Cai, Y.; Ma, B. A Review of YOLO algorithm developments. Procedia Comput. Sci. 2022, 199, 1066–1073. [Google Scholar] [CrossRef]
  7. Rajendran, S.P.; Shine, L.; Pradeep, R.; Vijayaraghavan, S. Fast and accurate traffic sign recognition for self driving cars using retinanet based detector. In Proceedings of the 2019 International Conference on Communication and Electronics Systems (ICCES), Coimbatore, India, 17–19 July 2019; pp. 784–790. [Google Scholar]
  8. Yang, W.; Zhang, W. Real-time traffic signs detection based on YOLO network model. In Proceedings of the 2020 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC), Chongqing, China, 29–30 October 2020; pp. 354–357. [Google Scholar]
  9. Dewi, C.; Chen, R.C.; Liu, Y.T.; Jiang, X.; Hartomo, K.D. YOLO V4 for advanced traffic sign recognition with synthetic training data generated by various GAN. IEEE Access 2021, 9, 97228–97242. [Google Scholar] [CrossRef]
  10. Mijić, D.; Brisinello, M.; Vranješ, M.; Grbić, R. Traffic Sign Detection Using YOLOv3. In Proceedings of the 2020 IEEE 10th International Conference on Consumer Electronics (ICCE-Berlin), Berlin, Germany, 9–11 November 2020; pp. 1–6. [Google Scholar]
  11. Mohd-Isa, W.N.; Abdullah, M.S.; Sarzil, M.; Abdullah, J.; Ali, A.; Hashim, N. Detection of Malaysian Traffic Signs via Modified YOLOv3 Algorithm. In Proceedings of the 2020 International Conference on Data Analytics for Business and Industry: Way Towards a Sustainable Economy (ICDABI), Sakheer, Bahrain, 26–27 October 2020; pp. 1–5. [Google Scholar]
  12. Avramović, A.; Sluga, D.; Tabernik, D.; Skočaj, D.; Stojnić, V.; Ilc, N. Neural-network-based traffic sign detection and recognition in high-definition images using region focusing and parallelization. IEEE Access 2020, 8, 189855–189868. [Google Scholar] [CrossRef]
  13. Gu, Y.; Si, B. A novel lightweight real-time traffic sign detection integration framework based on YOLOv4. Entropy 2022, 24, 487. [Google Scholar] [CrossRef] [PubMed]
  14. Expertos en Siniestros. Significado de las Señales de Tráfico en España. Available online: http://www.expertoensiniestros.es/significado-senales-de-trafico-en-espana/ (accessed on 27 December 2019).
  15. Gupta, A.; Anpalagan, A.; Guan, L.; Khwaja, A.S. Deep learning for object detection and scene perception in self-driving cars: Survey, challenges, and open issues. Array 2021, 10, 100057. [Google Scholar] [CrossRef]
  16. Diwan, T.; Anirudh, G.; Tembhurne, J.V. Object detection using YOLO: Challenges, architectural successors, datasets and applications. Multimed. Tools Appl. 2022, 82, 9243–9275. [Google Scholar] [CrossRef]
  17. Boukerche, A.; Hou, Z. Object Detection Using Deep Learning Methods in Traffic Scenarios. ACM Comput. Surv. 2022, 54, 1–35. [Google Scholar] [CrossRef]
  18. Ayachi, R.; Afif, M.; Said, Y.; Ben Abdelali, A. An edge implementation of a traffic sign detection system for Advanced Driver Assistance Systems. Int. J. Intell. Robot. Appl. 2022, 6, 207–215. [Google Scholar] [CrossRef]
  19. Gámez-Serna, C.; Ruichek, Y. Classification of Traffic Signs: The European Dataset. IEEE Access 2018, 6, 78136–78148. [Google Scholar] [CrossRef]
  20. Almuhajri, M.; Suen, C. Intensive Survey About Road Traffic Signs Preprocessing, Detection and Recognition. In Advances in Data Science, Cyber Security and IT Applications; Alfaries, A., Mengash, H., Yasar, A., Shakshuki, E., Eds.; Springer International Publishing: Berlin/Heidelberg, Germany, 2019; pp. 275–289. [Google Scholar]
  21. Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788. [Google Scholar]
  22. Liu, C.; Tao, Y.; Liang, J.; Li, K.; Chen, Y. Object detection based on YOLO network. In Proceedings of the 2018 IEEE 4th Information Technology and Mechatronics Engineering Conference (ITOEC), Chongqing, China, 14–16 December 2018; pp. 799–803. [Google Scholar]
  23. Redmon, J.; Farhadi, A. YOLO9000: Better, faster, stronger. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 7263–7271. [Google Scholar]
  24. Redmon, J.; Farhadi, A. YOLOv3: An Incremental Improvement. arXiv 2018, arXiv:1804.02767. [Google Scholar]
  25. Redmon, J. Darknet: Open Source Neural Networks in C. Available online: https://pjreddie.com/darknet/ (accessed on 15 May 2023).
  26. Bochkovskiy, A.; Wang, C.Y.; Liao, H.Y.M. YOLOv4: Optimal Speed and Accuracy of Object Detection. arXiv 2020, arXiv:2004.10934. [Google Scholar]
  27. Wang, C.Y. Scaled-YOLOv4: Scaling Cross Stage Partial Network. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021. [Google Scholar]
  28. Kasper-Eulaers, M.; Hahn, N.; Berger, S.; Sebulonsen, T.; Myrland, Ø.; Kummervold, P.E. Detecting heavy goods vehicles in rest areas in winter conditions using YOLOv5. Algorithms 2021, 14, 114. [Google Scholar] [CrossRef]
  29. Wali, S.B.; Abdullah, M.A.; Hannan, M.A.; Hussain, A.; Samad, S.A.; Ker, P.J.; Mansor, M.B. Vision-Based Traffic Sign Detection and Recognition Systems: Current Trends and Challenges. Sensors 2019, 19, 2093. [Google Scholar] [CrossRef]
  30. Borrego-Carazo, J.; Castells-Rufas, D.; Biempica, E.; Carrabina, J. Resource-Constrained Machine Learning for ADAS: A Systematic Review. IEEE Access 2020, 8, 40573–40598. [Google Scholar] [CrossRef]
  31. Muhammad, K.; Ullah, A.; Lloret, J.; Del Ser, J.; de Albuquerque, V.H.C. Deep Learning for Safe Autonomous Driving: Current Challenges and Future Directions. IEEE Trans. Intell. Transp. Syst. 2020, 22, 4316–4336. [Google Scholar] [CrossRef]
  32. Higgins, J.; Thomas, J.; Chandler, J.; Cumpston, M.; Li, T.; Page, M.; Welch, V. Cochrane Handbook for Systematic Reviews of Interventions Version 6.3 (Updated February 2022); Technical Report; Cochrane: London, UK, 2022. [Google Scholar]
  33. Cardoso Ermel, A.P.; Lacerda, D.P.; Morandi, M.I.W.M.; Gauss, L. Literature Reviews; Springer International Publishing: Cham, Switzerland, 2021. [Google Scholar] [CrossRef]
  34. Mengist, W.; Soromessa, T.; Legese, G. Method for conducting systematic literature review and meta-analysis for environmental science research. MethodsX 2020, 7, 100777. [Google Scholar] [CrossRef]
  35. Tarachandy, S.M.; Aravinth, J. Enhanced Local Features using Ridgelet Filters for Traffic Sign Detection and Recognition. In Proceedings of the 2021 Second International Conference on Electronics and Sustainable Communication Systems (ICESC), Coimbatore, India, 4–6 August 2021; pp. 1150–1156. [Google Scholar]
  36. Yan, W.; Yang, G.; Zhang, W.; Liu, L. Traffic Sign Recognition using YOLOv4. In Proceedings of the 2022 7th International Conference on Intelligent Computing and Signal Processing (ICSP), Xi’an, China, 15–17 April 2022; pp. 909–913. [Google Scholar]
  37. Kankaria, R.V.; Jain, S.K.; Bide, P.; Kothari, A.; Agarwal, H. Alert system for drivers based on traffic signs, lights and pedestrian detection. In Proceedings of the 2020 International Conference for Emerging Technology (INCET), Belgaum, India, 5–7 June 2020; pp. 1–5. [Google Scholar]
  38. Ren, X.; Zhang, W.; Wu, M.; Li, C.; Wang, X. Meta-YOLO: Meta-Learning for Few-Shot Traffic Sign Detection via Decoupling Dependencies. Appl. Sci. 2022, 12, 5543. [Google Scholar] [CrossRef]
  39. Liu, Y.; Shi, G.; Li, Y.; Zhao, Z. M-YOLO: Traffic sign detection algorithm applicable to complex scenarios. Symmetry 2022, 14, 952. [Google Scholar] [CrossRef]
  40. Li, M.; Zhang, L.; Li, L.; Song, W. YOLO-based traffic sign recognition algorithm. Comput. Intell. Neurosci. 2022, 2022, 2682921. [Google Scholar] [CrossRef] [PubMed]
  41. Yao, Z.; Song, X.; Zhao, L.; Yin, Y. Real-time method for traffic sign detection and recognition based on YOLOv3-tiny with multiscale feature extraction. Proc. Inst. Mech. Eng. Part D J. Automob. Eng. 2021, 235, 1978–1991. [Google Scholar] [CrossRef]
  42. Vasanthi, P.; Mohan, L. YOLOV5 and Morphological Hat Transform Based Traffic Sign Recognition. In Rising Threats in Expert Applications and Solutions; Springer: Berlin/Heidelberg, Germany, 2022; pp. 579–589. [Google Scholar]
  43. Rodríguez, R.C.; Carlos, C.M.; Villegas, O.O.V.; Sánchez, V.G.C.; Domínguez, H.d.J.O. Mexican traffic sign detection and classification using deep learning. Expert Syst. Appl. 2022, 202, 117247. [Google Scholar] [CrossRef]
  44. Nandavar, S.; Kaye, S.A.; Senserrick, T.; Oviedo-Trespalacios, O. Exploring the factors influencing acquisition and learning experiences of cars fitted with advanced driver assistance systems (ADAS). Transp. Res. Part F Traffic Psychol. Behav. 2023, 94, 341–352. [Google Scholar] [CrossRef]
  45. Yang, T.; Tong, C. Small Traffic Sign Detector in Real-time Based on Improved YOLO-v4. In Proceedings of the 2021 IEEE 23rd Int Conf on High Performance Computing & Communications; 7th Int Conf on Data Science & Systems; 19th Int Conf on Smart City; 7th Int Conf on Dependability in Sensor, Cloud & Big Data Systems & Application (HPCC/DSS/SmartCity/DependSys), Haikou, China, 20–22 December 2021; pp. 1318–1324. [Google Scholar]
  46. William, M.M.; Zaki, P.S.; Soliman, B.K.; Alexsan, K.G.; Mansour, M.; El-Moursy, M.; Khalil, K. Traffic signs detection and recognition system using deep learning. In Proceedings of the 2019 Ninth International Conference on Intelligent Computing and Information Systems (ICICIS), Cairo, Egypt, 8–10 December 2019; pp. 160–166. [Google Scholar]
  47. Hsieh, M.H.; Zhao, Q. A Study on Two-Stage Approach for Traffic Sign Recognition: Few-to-Many or Many-to-Many? In Proceedings of the 2020 International Conference on Machine Learning and Cybernetics (ICMLC), Adelaide, Australia, 2 December 2020; pp. 76–81. [Google Scholar]
  48. Gopal, R.; Kuinthodu, S.; Balamurugan, M.; Atique, M. Tiny Object Detection: Comparative Study using Single Stage CNN Object Detectors. In Proceedings of the 2019 IEEE International WIE Conference on Electrical and Computer Engineering (WIECON-ECE), Bangalore, India, 15–16 November 2019; pp. 1–3. [Google Scholar]
  49. Kavya, R.; Hussain, K.M.Z.; Nayana, N.; Savanur, S.S.; Arpitha, M.; Srikantaswamy, R. Lane Detection and Traffic Sign Recognition from Continuous Driving Scenes using Deep Neural Networks. In Proceedings of the 2021 2nd International Conference on Smart Electronics and Communication (ICOSEC), Trichy, India, 7–9 December 2021; pp. 1461–1467. [Google Scholar]
  50. Rani, S.J.; Eswaran, S.U.; Mukund, A.V.; Vidul, M. Driver Assistant System using YOLO V3 and VGGNET. In Proceedings of the 2022 International Conference on Inventive Computation Technologies (ICICT), Lalitpur, Nepal, 20–22 July 2022; pp. 372–376. [Google Scholar]
  51. Zhang, H.; Qin, L.; Li, J.; Guo, Y.; Zhou, Y.; Zhang, J.; Xu, Z. Real-time detection method for small traffic signs based on YOLOv3. IEEE Access 2020, 8, 64145–64156. [Google Scholar] [CrossRef]
  52. Garg, P.; Chowdhury, D.R.; More, V.N. Traffic sign recognition and classification using YOLOv2, Faster R-CNN and SSD. In Proceedings of the 2019 10th International Conference on Computing, Communication and Networking Technologies (ICCCNT), Kanpur, India, 6–8 July 2019; pp. 1–5. [Google Scholar]
  53. Devyatkin, A.; Filatov, D. Neural network traffic signs detection system development. In Proceedings of the 2019 XXII International Conference on Soft Computing and Measurements (SCM)), St. Petersburg, Russia, 23–25 May 2019; pp. 125–128. [Google Scholar]
  54. Ruta, A.; Li, Y.; Liu, X. Real-time traffic sign recognition from video by class-specific discriminative features. Pattern Recognit. 2010, 43, 416–430. [Google Scholar] [CrossRef]
  55. Novak, B.; Ilić, V.; Pavković, B. YOLOv3 Algorithm with additional convolutional neural network trained for traffic sign recognition. In Proceedings of the 2020 Zooming Innovation in Consumer Technologies Conference (ZINC), Novi Sad, Serbia, 26–27 May 2020; pp. 165–168. [Google Scholar]
  56. Arief, R.W.; Nurtanio, I.; Samman, F.A. Traffic Signs Detection and Recognition System Using the YOLOv4 Algorithm. In Proceedings of the 2021 International Conference on Artificial Intelligence and Mechatronics Systems (AIMS), Bandung, Indonesia, 28–30 April 2021; pp. 1–6. [Google Scholar]
  57. Manocha, P.; Kumar, A.; Khan, J.A.; Shin, H. Korean traffic sign detection using deep learning. In Proceedings of the 2018 International SoC Design Conference (ISOCC), Daegu, Republic of Korea, 12–15 November 2018; pp. 247–248. [Google Scholar]
  58. Rajendran, S.P.; Shine, L.; Pradeep, R.; Vijayaraghavan, S. Real-time traffic sign recognition using YOLOv3 based detector. In Proceedings of the 2019 10th International Conference on Computing, Communication and Networking Technologies (ICCCNT), Kanpur, India, 6–8 July 2019; pp. 1–7. [Google Scholar]
  59. Khan, J.A.; Yeo, D.; Shin, H. New dark area sensitive tone mapping for deep learning based traffic sign recognition. Sensors 2018, 18, 3776. [Google Scholar] [CrossRef]
  60. Zhang, J.; Huang, M.; Jin, X.; Li, X. A real-time chinese traffic sign detection algorithm based on modified YOLOv2. Algorithms 2017, 10, 127. [Google Scholar] [CrossRef]
  61. Seraj, M.; Rosales-Castellanos, A.; Shalkamy, A.; El-Basyouny, K.; Qiu, T.Z. The implications of weather and reflectivity variations on automatic traffic sign recognition performance. J. Adv. Transp. 2021, 2021, 5513552. [Google Scholar] [CrossRef]
  62. Du, L.; Ji, J.; Pei, Z.; Zheng, H.; Fu, S.; Kong, H.; Chen, W. Improved detection method for traffic signs in real scenes applied in intelligent and connected vehicles. IET Intell. Transp. Syst. 2020, 14, 1555–1564. [Google Scholar] [CrossRef]
  63. Ren, K.; Huang, L.; Fan, C.; Han, H.; Deng, H. Real-time traffic sign detection network using DS-DetNet and lite fusion FPN. J. Real-Time Image Process. 2021, 18, 2181–2191. [Google Scholar] [CrossRef]
  64. Khan, J.A.; Chen, Y.; Rehman, Y.; Shin, H. Performance enhancement techniques for traffic sign recognition using a deep neural network. Multimed. Tools Appl. 2020, 79, 20545–20560. [Google Scholar] [CrossRef]
  65. Zhang, J.; Ye, Z.; Jin, X.; Wang, J.; Zhang, J. Real-time traffic sign detection based on multiscale attention and spatial information aggregator. J. Real-Time Image Process. 2022, 19, 1155–1167. [Google Scholar] [CrossRef]
  66. Karthika, R.; Parameswaran, L. A Novel Convolutional Neural Network Based Architecture for Object Detection and Recognition with an Application to Traffic Sign Recognition from Road Scenes. Pattern Recognit. Image Anal. 2022, 32, 351–362. [Google Scholar] [CrossRef]
  67. Chen, J.; Jia, K.; Chen, W.; Lv, Z.; Zhang, R. A real-time and high-precision method for small traffic-signs recognition. Neural Comput. Appl. 2022, 34, 2233–2245. [Google Scholar] [CrossRef]
  68. Wenze, Y.; Quan, L.; Zicheng, Z.; Jinjing, H.; Hansong, W.; Zhihui, F.; Neng, X.; Chuanbo, H.; Chang, K.C. An Improved Lightweight Traffic Sign Detection Model for Embedded Devices. In Proceedings of the International Conference on Advanced Intelligent Systems and Informatics, Cairo, Egypt, 11–13 December 2021; pp. 154–164. [Google Scholar]
  69. Lima, A.A.; Kabir, M.; Das, S.C.; Hasan, M.; Mridha, M. Road Sign Detection Using Variants of YOLO and R-CNN: An Analysis from the Perspective of Bangladesh. In Proceedings of the International Conference on Big Data, IoT, and Machine Learning, Cairo, Egypt, 20–22 November 2022; pp. 555–565. [Google Scholar]
  70. Qin, Z.; Yan, W.Q. Traffic-sign recognition using deep learning. In Proceedings of the International Symposium on Geometry and Vision, Auckland, New Zealand, 28–29 January 2021; pp. 13–25. [Google Scholar]
  71. Xing, J.; Yan, W.Q. Traffic sign recognition using guided image filtering. In Proceedings of the International Symposium on Geometry and Vision, Auckland, New Zealand, 28–29 January 2021; pp. 85–99. [Google Scholar]
  72. Hussain, Z.; Md, K.; Kattigenahally, K.N.; Nikitha, S.; Jena, P.P.; Harshalatha, Y. Traffic Symbol Detection and Recognition System. In Emerging Research in Computing, Information, Communication and Applications; Springer: Berlin/Heidelberg, Germany, 2022; pp. 885–897. [Google Scholar]
  73. Chniti, H.; Mahfoudh, M. Designing a Model of Driving Scenarios for Autonomous Vehicles. In Proceedings of the International Conference on Knowledge Science, Engineering and Management, Singapore, 6–8 August 2022; pp. 396–405. [Google Scholar]
  74. Dewi, C.; Chen, R.C.; Yu, H. Weight analysis for various prohibitory sign detection and recognition using deep learning. Multimed. Tools Appl. 2020, 79, 32897–32915. [Google Scholar] [CrossRef]
  75. Dewi, C.; Chen, R.C.; Jiang, X.; Yu, H. Deep convolutional neural network for enhancing traffic sign recognition developed on Yolo V4. Multimed. Tools Appl. 2022, 81, 37821–37845. [Google Scholar] [CrossRef]
  76. Tran, A.C.; Dien, D.L.; Huynh, H.X.; Long, N.H.V.; Tran, N.C. A Model for Real-Time Traffic Signs Recognition Based on the YOLO Algorithm–A Case Study Using Vietnamese Traffic Signs. In Proceedings of the International Conference on Future Data and Security Engineering, Nha Trang City, Vietnam, 27–29 November 2019; pp. 104–116. [Google Scholar]
  77. Güney, E.; Bayilmiş, C.; Çakan, B. An implementation of real-time traffic signs and road objects detection based on mobile GPU platforms. IEEE Access 2022, 10, 86191–86203. [Google Scholar] [CrossRef]
  78. Saouli, A.; Margae, S.E.; Aroussi, M.E.; Fakhri, Y. Real-Time Traffic Sign Recognition on Sipeed Maix AI Edge Computing. In Proceedings of the International Conference on Advanced Intelligent Systems for Sustainable Development, Tangier, Morocco, 21–26 December 2020; pp. 517–528. [Google Scholar]
  79. Satılmış, Y.; Tufan, F.; Şara, M.; Karslı, M.; Eken, S.; Sayar, A. CNN based traffic sign recognition for mini autonomous vehicles. In Proceedings of the International Conference on Information Systems Architecture and Technology, Szklarska Poreba, Poland, 17–19 September 2018; pp. 85–94. [Google Scholar]
  80. Wu, Y.; Li, Z.; Chen, Y.; Nai, K.; Yuan, J. Real-time traffic sign detection and classification towards real traffic scene. Multimed. Tools Appl. 2020, 79, 18201–18219. [Google Scholar] [CrossRef]
  81. Li, S.; Cheng, X.; Zhou, Z.; Zhao, B.; Li, S.; Zhou, J. Multi-scale traffic sign detection algorithm based on improved YOLO V4. In Proceedings of the 25th International Conference on Intelligent Transportation Systems (ITSC), Macau, China, 18 September–12 October 2022; pp. 8–12. [Google Scholar]
  82. Youssouf, N. Traffic sign classification using CNN and detection using faster-RCNN and YOLOV4. Heliyon 2022, 8, e11792. [Google Scholar] [CrossRef]
  83. Zou, H.; Zhan, H.; Zhang, L. Neural Network Based on Multi-Scale Saliency Fusion for Traffic Signs Detection. Sustainability 2022, 14, 16491. [Google Scholar] [CrossRef]
  84. Yang, Z. Intelligent Recognition of Traffic Signs Based on Improved YOLO v3 Algorithm. Mob. Inf. Syst. 2022, 2022, 11. [Google Scholar] [CrossRef]
  85. Zhao, W.; Danjing, L. Traffic sign detection research based on G-GhostNet-YOLOx model. In Proceedings of the IEEE 4th International Conference on Civil Aviation Safety and Information Technology (ICCASIT), Dali, China, 12–14 October 2022. [Google Scholar]
  86. Bai, W.; Zhao, J.; Dai, C.; Zhang, H.; Zhao, L.; Ji, Z.; Ganchev, I. Two Novel Models for Traffic Sign Detection Based on YOLOv5s. Axioms 2023, 12, 160. [Google Scholar] [CrossRef]
  87. Jiang, J.; Yang, J.; Yin, J. Traffic sign target detection method based on deep learning. In Proceedings of the 2021 International Conference on Computer Information Science and Artificial Intelligence (CISAI), Kunming, China, 17–19 September 2021; pp. 74–78. [Google Scholar]
  88. Kumagai, K.; Goto, T. Improving Accuracy of Traffic Sign Detection Using Learning Method. In Proceedings of the 2022 IEEE 4th Global Conference on Life Sciences and Technologies (LifeTech), Osaka, Japan, 7–9 March 2022; pp. 318–319. [Google Scholar]
  89. Yu, J.; Ye, X.; Tu, Q. Traffic Sign Detection and Recognition in Multiimages Using a Fusion Model With YOLO and VGG Network. IEEE Trans. Intell. Transp. Syst. 2022, 23, 16632–16642. [Google Scholar] [CrossRef]
  90. Ćorović, A.; Ilić, V.; Ðurić, S.; Marijan, M.; Pavković, B. The real-time detection of traffic participants using YOLO algorithm. In Proceedings of the 2018 26th Telecommunications Forum (TELFOR), Belgrade, Serbia, 20–21 November 2018; pp. 1–4. [Google Scholar]
  91. Kong, S.; Park, J.; Lee, S.S.; Jang, S.J. Lightweight traffic sign recognition algorithm based on cascaded CNN. In Proceedings of the 2019 19th International Conference on Control, Automation and Systems (ICCAS), Jeju, Republic of Korea, 15–18 October 2019; pp. 506–509. [Google Scholar]
  92. Song, W.; Suandi, S.A. TSR-YOLO: A Chinese Traffic Sign Recognition Algorithm for Intelligent Vehicles in Complex Scenes. Sensors 2023, 23, 749. [Google Scholar] [CrossRef]
  93. Kumar, V.A.; Raghuraman, M.; Kumar, A.; Rashid, M.; Hakak, S.; Reddy, M.P.K. Green-Tech CAV: Next Generation Computing for Traffic Sign and Obstacle Detection in Connected and Autonomous Vehicles. IEEE Trans. Green Commun. Netw. 2022, 6, 1307–1315. [Google Scholar] [CrossRef]
  94. Liu, X.; Jiang, X.; Hu, H.; Ding, R.; Li, H.; Da, C. Traffic Sign Recognition Algorithm Based on Improved YOLOv5s. In Proceedings of the 2021 International Conference on Control, Automation and Information Sciences (ICCAIS), Xi’an, China, 14–17 October 2021; pp. 980–985. [Google Scholar]
  95. Wang, H.; Yu, H. Traffic sign detection algorithm based on improved YOLOv4. In Proceedings of the 2020 IEEE 9th Joint International Information Technology and Artificial Intelligence Conference (ITAIC), Chongqing, China, 11–13 December 2020; Volume 9, pp. 1946–1950. [Google Scholar]
  96. Rehman, Y.; Amanullah, H.; Shirazi, M.A.; Kim, M.Y. Small Traffic Sign Detection in Big Images: Searching Needle in a Hay. IEEE Access 2022, 10, 18667–18680. [Google Scholar] [CrossRef]
  97. Alhabshee, S.M.; bin Shamsudin, A.U. Deep learning traffic sign recognition in autonomous vehicle. In Proceedings of the 2020 IEEE Student Conference on Research and Development (SCOReD), Batu Pahat, Malaysia, 27–29 September 2020; pp. 438–442. [Google Scholar]
  98. Ikhlayel, M.; Iswara, A.J.; Kurniawan, A.; Zaini, A.; Yuniarno, E.M. Traffic Sign Detection for Navigation of Autonomous Car Prototype using Convolutional Neural Network. In Proceedings of the 2020 International Conference on Computer Engineering, Network, and Intelligent Multimedia (CENIM), Surabaya, Indonesia, 17–18 October 2020; pp. 205–210. [Google Scholar]
  99. Jeya Visshwak, J.; Saravanakumar, P.; Minu, R. On-The-Fly Traffic Sign Image Labeling. In Proceedings of the 2020 International Conference on Communication and Signal Processing (ICCSP), Chennai, India, 28–30 July 2020; pp. 0530–0532. [Google Scholar]
  100. Valeja, Y.; Pathare, S.; Patel, D.; Pawar, M. Traffic Sign Detection using CLARA and YOLO in Python. In Proceedings of the 2021 7th International Conference on Advanced Computing and Communication Systems (ICACCS), Coimbatore, India, 19–20 March 2021; Volume 1, pp. 367–371. [Google Scholar]
  101. Shahud, M.; Bajracharya, J.; Praneetpolgrang, P.; Petcharee, S. Thai traffic sign detection and recognition using convolutional neural networks. In Proceedings of the 2018 22nd International Computer Science and Engineering Conference (ICSEC), Chiang Mai, Thailand, 21–24 November 2018; pp. 1–5. [Google Scholar]
  102. Chen, Y.; Wang, J.; Dong, Z.; Yang, Y.; Luo, Q.; Gao, M. An Attention Based YOLOv5 Network for Small Traffic Sign Recognition. In Proceedings of the 2022 IEEE 31st International Symposium on Industrial Electronics (ISIE), Anchorage, AK, USA, 1–3 June 2022; pp. 1158–1164. [Google Scholar]
  103. Park, Y.K.; Park, H.; Woo, Y.S.; Choi, I.G.; Han, S.S. Traffic Landmark Matching Framework for HD-Map Update: Dataset Training Case Study. Electronics 2022, 11, 863. [Google Scholar] [CrossRef]
  104. Rehman, Y.; Amanullah, H.; Saqib Bhatti, D.M.; Toor, W.T.; Ahmad, M.; Mazzara, M. Detection of Small Size Traffic Signs Using Regressive Anchor Box Selection and DBL Layer Tweaking in YOLOv3. Appl. Sci. 2021, 11, 11555. [Google Scholar] [CrossRef]
  105. Song, S.; Li, Y.; Huang, Q.; Li, G. A new real-time detection and tracking method in videos for small target traffic signs. Appl. Sci. 2021, 11, 3061. [Google Scholar] [CrossRef]
  106. Zhang, S.; Che, S.; Liu, Z.; Zhang, X. A real-time and lightweight traffic sign detection method based on ghost-YOLO. Multimed. Tools Appl. 2023, 82, 26063–26087. [Google Scholar]
  107. Dewi, C.; Chen, R.C.; Tai, S.K. Evaluation of robust spatial pyramid pooling based on convolutional neural network for traffic sign recognition system. Electronics 2020, 9, 889. [Google Scholar] [CrossRef]
  108. Huang, H.; Liang, Q.; Luo, D.; Lee, D.H. Attention-Enhanced One-Stage Algorithm for Traffic Sign Detection and Recognition. J. Sens. 2022, 2022, 3705256. [Google Scholar] [CrossRef]
  109. Wan, H.; Gao, L.; Su, M.; You, Q.; Qu, H.; Sun, Q. A novel neural network model for traffic sign detection and recognition under extreme conditions. J. Sens. 2021, 2021, 9984787. [Google Scholar] [CrossRef]
  110. Zhu, Y.; Yan, W.Q. Traffic sign recognition based on deep learning. Multimed. Tools Appl. 2022, 81, 17779–17791. [Google Scholar] [CrossRef]
  111. Li, Y.; Li, J.; Meng, P. Attention-YOLOV4: A real-time and high-accurate traffic sign detection algorithm. Multimed. Tools Appl. 2022, 82, 7567–7582. [Google Scholar] [CrossRef]
  112. Zhang, Y.; Lu, Y.; Zhu, W.; Wei, X.; Wei, Z. Traffic sign detection based on multi-scale feature extraction and cascade feature fusion. J. Supercomput. 2022, 79, 2137–2152. [Google Scholar] [CrossRef]
  113. Yang, T.; Tong, C. Real-time detection network for tiny traffic sign using multi-scale attention module. Sci. China Technol. Sci. 2022, 65, 396–406. [Google Scholar] [CrossRef]
  114. Xing, J.; Nguyen, M.; Qi Yan, W. The Improved Framework for Traffic Sign Recognition Using Guided Image Filtering. SN Comput. Sci. 2022, 3, 1–16. [Google Scholar] [CrossRef]
  115. Bi, Z.; Xu, F.; Shan, M.; Yu, L. YOLO-RFB: An Improved Traffic Sign Detection Model. In Proceedings of the International Conference on Mobile Computing, Applications, and Services, Online, 22–24 July 2022; pp. 3–18. [Google Scholar]
  116. Zeng, H. Real-Time Traffic Sign Detection Based on Improved YOLO V3. In Proceedings of the 11th International Conference on Computer Engineering and Networks, Hechi, China, 21–25 October 2022; pp. 167–172. [Google Scholar]
  117. Galgali, R.; Punagin, S.; Iyer, N. Traffic Sign Detection and Recognition for Hazy Images: ADAS. In Proceedings of the International Conference on Image Processing and Capsule Networks, Bangkok, Thailand, 27–28 May 2021; pp. 650–661. [Google Scholar]
  118. Mangshor, N.N.A.; Paudzi, N.P.A.M.; Ibrahim, S.; Sabri, N. A Real-Time Malaysian Traffic Sign Recognition Using YOLO Algorithm. In Proceedings of the 12th National Technical Seminar on Unmanned System Technology 2020, Virtual, 27–28 October 2022; pp. 283–293. [Google Scholar]
  119. Le, B.L.; Lam, G.H.; Nguyen, X.V.; Duong, Q.L.; Tran, Q.D.; Do, T.H.; Dao, N.N. A Deep Learning Based Traffic Sign Detection for Intelligent Transportation Systems. In Proceedings of the International Conference on Computational Data and Social Networks, Online, 15–17 November 2021; pp. 129–137. [Google Scholar]
  120. Ma, L.; Wu, Q.; Zhan, Y.; Liu, B.; Wang, X. Traffic Sign Detection Based on Improved YOLOv3 in Foggy Environment. In Proceedings of the International Conference on Wireless Communications, Networking and Applications, Wuhan, China, 16–18 December 2022; pp. 685–695. [Google Scholar]
  121. Fan, W.; Yi, N.; Hu, Y. A Traffic Sign Recognition Method Based on Improved YOLOv3. In Proceedings of the International Conference on Intelligent Automation and Soft Computing, Chicago, IL, USA, 26–28 May 2021; pp. 846–853. [Google Scholar]
  122. Tao, X.; Li, H.; Deng, L. Research on Self-driving Lane and Traffic Marker Recognition Based on Deep Learning. In Proceedings of the International Symposium on Intelligence Computation and Applications, Guangzhou, China, 20–21 November 2021; pp. 112–123. [Google Scholar]
  123. Dewi, C.; Chen, R.C.; Yu, H.; Jiang, X. Robust detection method for improving small traffic sign recognition based on spatial pyramid pooling. J. Ambient. Intell. Humaniz. Comput. 2021, 14, 8135–8152. [Google Scholar] [CrossRef]
  124. Wan, J.; Ding, W.; Zhu, H.; Xia, M.; Huang, Z.; Tian, L.; Zhu, Y.; Wang, H. An efficient small traffic sign detection method based on YOLOv3. J. Signal Process. Syst. 2021, 93, 899–911. [Google Scholar] [CrossRef]
  125. Peng, E.; Chen, F.; Song, X. Traffic sign detection with convolutional neural networks. In Proceedings of the International Conference on Cognitive Systems and Signal Processing, Beijing, China, 19–23 November 2016; pp. 214–224. [Google Scholar]
  126. Thipsanthia, P.; Chamchong, R.; Songram, P. Road Sign Detection and Recognition of Thai Traffic Based on YOLOv3. In Proceedings of the International Conference on Multi-disciplinary Trends in Artificial Intelligence, Kuala Lumpur, Malaysia, 17–19 November 2019; pp. 271–279. [Google Scholar]
  127. Arcos-Garcia, A.; Alvarez-Garcia, J.A.; Soria-Morillo, L.M. Evaluation of deep neural networks for traffic sign detection systems. Neurocomputing 2018, 316, 332–344. [Google Scholar] [CrossRef]
  128. Ma, X.; Li, X.; Tang, X.; Zhang, B.; Yao, R.; Lu, J. Deconvolution Feature Fusion for traffic signs detection in 5G driven unmanned vehicle. Phys. Commun. 2021, 47, 101375. [Google Scholar] [CrossRef]
  129. Khnissi, K.; Jabeur, C.B.; Seddik, H. Implementation of a Compact Traffic Signs Recognition System Using a New Squeezed YOLO. Int. J. Intell. Transp. Syst. Res. 2022, 20, 466–482. [Google Scholar] [CrossRef]
  130. Khafaji, Y.A.A.; Abbadi, N.K.E. Traffic Signs Detection and Recognition Using a Combination of YOLO and CNN. In Proceedings of the International Conference on Communication & Information Technologies (IICCIT-2022), Basrah, Iraq, 7–8 September 2022. [Google Scholar]
  131. Yan, B.; Li, J.; Yang, Z.; Zhang, X.; Hao, X. AIE-YOLO: Auxiliary Information Enhanced YOLO for Small Object Detection. Sensors 2022, 22, 8221. [Google Scholar] [CrossRef]
  132. He, X.; Cheng, R.; Zheng, Z.; Wang, Z. Small Object Detection in Traffic Scenes Based on YOLO-MXANet. Sensors 2021, 21, 7422. [Google Scholar] [CrossRef] [PubMed]
  133. Wang, X.; Guo, J.; Yi, J.; Song, Y.; Xu, J.; Yan, W.; Fu, X. Real-Time and Efficient Multi-Scale Traffic Sign Detection Method for Driverless Cars. Sensors 2022, 22, 6930. [Google Scholar] [CrossRef]
  134. Wang, Y.; Bai, M.; Wang, M.; Zhao, F.; Guo, J. Multiscale Traffic Sign Detection Method in Complex Environment Based on YOLOv4. Comput. Intell. Neurosci. 2022, 2022, 15. [Google Scholar] [CrossRef]
  135. Wang, J.; Chen, Y.; Dong, Z.; Gao, M. Improved YOLOv5 network for real-time multi-scale traffic sign detection. Neural Comput. Appl. 2021, 35, 7853–7865. [Google Scholar] [CrossRef]
  136. Yao, Y.; Han, L.; Du, C.; Xu, X.; Jiang, X. Traffic sign detection algorithm based on improved YOLOv4-Tiny. Signal Process. Image Commun. 2022, 107, 116783. [Google Scholar] [CrossRef]
  137. Doherty, J.; Gardiner, B.; Kerr, E.; Siddique, N.; Manvi, S.S. Comparative Study of Activation Functions and Their Impact on the YOLOv5 Object Detection Model. In Proceedings of the International Conference on Pattern Recognition and Artificial Intelligence, Paris, France, 1–3 June 2022; pp. 40–52. [Google Scholar]
  138. Lu, H.; Chen, T.; Shi, L. Research on Small Target Detection Method of Traffic Signs Improved Based on YOLOv3. In Proceedings of the 2022 2nd International Conference on Consumer Electronics and Computer Engineering (ICCECE), Guangzhou, China, 14–16 January 2022; pp. 476–481. [Google Scholar]
  139. Algorry, A.M.; García, A.G.; Wofmann, A.G. Real-time object detection and classification of small and similar figures in image processing. In Proceedings of the 2017 International Conference on Computational Science and Computational Intelligence (CSCI), Las Vegas, NV, USA, 14–16 December 2017; pp. 516–519. [Google Scholar]
  140. Kuo, Y.L.; Lin, S.H. Applications of deep learning to road sign detection in DVR images. In Proceedings of the 2019 IEEE International Symposium on Measurement and Control in Robotics (ISMCR), Houston, TX, USA, 19–21 September 2019; p. A2-1. [Google Scholar]
  141. Abraham, A.; Purwanto, D.; Kusuma, H. Traffic Lights and Traffic Signs Detection System Using Modified You Only Look Once. In Proceedings of the 2021 International Seminar on Intelligent Technology and Its Applications (ISITIA), Surabaya, Indonesia, 21–22 July 2021; pp. 141–146. [Google Scholar]
Figure 1. General scheme of a traffic sign detection and recognition system using the YOLO object detection algorithm. The central image has been taken from [18].
Figure 2. Scatter plots comparing FPS versus mAP and ACC versus mAP, extracted from scientific papers (2016–2022) evaluating the performance of traffic sign detection and recognition systems using YOLO. Average values are plotted.
Figure 3. Box plots depicting the mAP, ACC, and FPS metrics extracted from scientific papers (2016–2022), which evaluate the performance of traffic sign detection and recognition systems.
Figure 4. Bar charts of evaluation metrics used in scientific papers to assess the quality of traffic sign detection and recognition systems using YOLO in the period 2016–2022.
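For readers scanning Figures 2–4, the plotted quantities follow the standard object detection definitions summarized below. This is a notational sketch (TP, FP, FN, and TN denote true/false positives and negatives) rather than a formula set taken from any single surveyed paper:

```latex
\[
\text{Precision} = \frac{TP}{TP + FP}, \qquad
\text{Recall} = \frac{TP}{TP + FN}, \qquad
\text{ACC} = \frac{TP + TN}{TP + TN + FP + FN}
\]
\[
\text{AP}_c = \int_{0}^{1} p_c(r)\, dr, \qquad
\text{mAP} = \frac{1}{N} \sum_{c=1}^{N} \text{AP}_c
\]
% p_c(r): precision of class c at recall r; N: number of sign classes.
% FPS is simply the number of frames the detector processes per second.
```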
Figure 5. Horizontal bar charts of the frequency of computer systems reported in scientific papers focused on traffic sign detection and recognition systems using YOLO in the period 2016–2022.
Table 1. Search strings formulated with the keywords for each digital library.

| Database | Search String | Results |
|---|---|---|
| IEEE | ((“All Metadata”: Traffic Sign) AND ((“All Metadata”: Detection) OR (“All Metadata”: Recognition) OR (“All Metadata”: Identification)) AND (“All Metadata”: Object Detection) OR (“All Metadata”: YOLO)) | 2722 |
| Springer | ‘Object AND Detection AND “Traffic Sign” AND (Detection OR Recognition OR Identification OR YOLO)’ | 1852 |
| MDPI | All Fields: Traffic Sign Detection OR All Fields: Traffic Sign Recognition OR All Fields: Traffic Sign Identification AND Keywords: You Only Look Once OR Keywords: Object Detection | 498 |
| Hindawi P.G. | (“Traffic Sign” AND (“Detection” OR “Recognition” OR “Identification”)) AND (“YOLO”) | 16 |
| Science Direct | (Traffic Sign OR Traffic Sign Detection OR Traffic Sign Recognition OR Traffic Sign Identification OR Object Detection) AND (YOLO OR You Only Look Once) | 80,650 |
| Wiley | “Traffic Sign OR Detection OR Recognition OR Identification” anywhere and “YOLO OR Object Detection” anywhere | 160,803 |
| Sage | Traffic Sign OR Detection OR Recognition OR Identification AND Object Detection OR YOLO | 72,941 |
| Taylor & Francis | [All: traffic] AND [[All: sign] OR [All: detection] OR [All: recognition] OR [All: identification]] AND [[All: object] OR [All: detection] OR [All: yolo]] | 135,385 |
| PLOS | ((((everything: “Traffic Sign”) AND everything: Detection) OR everything: Identification) OR everything: YOLO) OR everything: “Object Detection” | 140,023 |
Table 2. Summary of the traffic sign datasets extracted from scientific papers and employed in the development of traffic sign detection and recognition systems using YOLO in the period 2016–2022.

| Order | Country of Origin | Dataset | Categories | Classes | Images | Traffic Signs | Referencing Articles | Percentage (%) |
|---|---|---|---|---|---|---|---|---|
| 1 | Germany | GTSRB and GTSDB | +3 | 43 | 52,740 / 900 | 52,740 / 910 | 44 | 27.33 |
| 2 | China | TT100K | 3 | 130 | 100,000 | 30,000 | 26 | 16.15 |
| 3 | China | CTSDB and CCTSDB | 3 | 21 | 10,000 / ∼20,000 | ∼40,000 | 21 | 13.04 |
| 4 | Belgium | BTSDB and BTSCB | 3 | 62 | 17,000 / 7125 | 4627 / 7125 | 6 | 3.73 |
| 5 | South Korea | KTSD | – | – | 3300 | – | 3 | 1.86 |
| 6 | Malaysia | MTSD | 5 | 66 | 1000 / 2056 | 2056 | 2 | 1.24 |
| 7 | USA | BDD100K | – | – | 100,000 | – | 2 | 1.24 |
| 8 | Thailand | TTSD | – | 50 | 9357 | – | 2 | 1.24 |
| 9 | France | FRIDA and FRIDA2 | – | – | 90 / 300 | – | 2 | 1.24 |
| 10 | France | FROSI | – | 4 | 504 | 1620 | 2 | 1.24 |
| 11 | Sweden | STSD | – | 7 | 20,000 | 3488 | 2 | 1.24 |
| 12 | Slovenia | DFG | +3 | +200 | 7000 | 13,239 / 4359 | 1 | 0.62 |
| 13 | Taiwan | TWTSD | – | – | 900 | – | 1 | 0.62 |
| 14 | Taiwan | TW synthetic | – | 3 | – | 900 | 1 | 0.62 |
| 15 | Belgium | KUL | – | +100 | +10,000 | – | 1 | 0.62 |
| 16 | China | CSUST | – | – | 15,000 | – | 1 | 0.62 |
| 17 | Morocco | TR | – | 7 | 3564 | 3564 | 1 | 0.62 |
| 18 | Turkey | – | – | 22 | 2500 | – | 1 | 0.62 |
| 19 | Vietnam | – | 4 | 144 | 5000 | 5704 | 1 | 0.62 |
| 20 | Croatia | – | – | 11 | 5567 | 6751 | 1 | 0.62 |
| 21 | Mexico | – | 3 | 11 | 1284 | 1426 | 1 | 0.62 |
| 22 | China | WHUTCTSD | 5 | – | 2700 | 4009 | 1 | 0.62 |
| 23 | Bangladesh | BDRS2021 | 4 | 16 | 2688 | – | 1 | 0.62 |
| 24 | New Zealand | NZ-TS3K | 3 | 7 | 3436 | 3545 | 1 | 0.62 |
| 25 | AW ¹ | MapiTSD | – | 300 | 100,000 | 320,000 | 1 | 0.62 |
| 26 | SW ² | Own | – | – | – | – | 14 | 8.70 |
| 27 | SW ² | Unknown | – | – | – | – | 21 | 13.04 |
| Total | | | | | | | 161 | 100.00 |

¹ AW: around the world; ² SW: somewhere in the world. Where two related datasets share a row, values before and after “/” refer to the first and second dataset, respectively.
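To make concrete how the datasets in Table 2 are typically wired into a YOLO pipeline, the sketch below converts one corner-format bounding-box annotation (absolute pixels, as GTSDB-style ground-truth files provide) into the normalized center-format label that YOLO trainers expect. The file name, class id, and box values are illustrative assumptions, not data from any surveyed paper:

```python
# Minimal sketch: convert a corner-format box (x1, y1, x2, y2 in pixels)
# to the YOLO label line "class_id x_center y_center width height",
# with all four geometry values normalized to [0, 1].

def to_yolo(box, img_w, img_h):
    """Map absolute pixel corners to YOLO's normalized center format."""
    x1, y1, x2, y2 = box
    x_c = (x1 + x2) / 2.0 / img_w
    y_c = (y1 + y2) / 2.0 / img_h
    w = (x2 - x1) / img_w
    h = (y2 - y1) / img_h
    return x_c, y_c, w, h

# Hypothetical example: one sign in a 1360 x 800 frame (GTSDB resolution).
img_w, img_h = 1360, 800
class_id, box = 3, (774, 411, 815, 446)

x_c, y_c, w, h = to_yolo(box, img_w, img_h)
with open("00001.txt", "w") as f:  # YOLO expects one label file per image
    f.write(f"{class_id} {x_c:.6f} {y_c:.6f} {w:.6f} {h:.6f}\n")
```

The same routine applies to any dataset in the table once the corresponding image dimensions and class mapping are substituted.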
Table 3. Thorough analysis of challenges and corresponding solutions explored in scientific papers addressing traffic sign detection systems in the period 2016–2022.

| Challenge | Sol1 | Sol2 | Sol3 |
|---|---|---|---|
| CH1 Fluctuation in lighting conditions | [13,18,35,39,46,49,50,52,59,61,62,65,67,71,76,77,78,80,81,82,85,89,90,97,98,100,101,105,106,109,113,114,117,124,125,127,130,134,136,141] | [7,12,36,42,43,47,48,54,55,56,57,58,60,63,64,66,68,70,72,73,74,84,86,91,92,99,104,107,110,112,115,122,123,128,137,138] | [8,9,37,53,69,79,87,88,93,94,96,108,118,121,126,129,139] |
| CH2 Adverse weather conditions | [10,13,18,22,35,39,46,49,50,51,52,61,62,65,67,71,76,77,78,81,82,83,85,90,98,100,103,105,109,113,114,117,120,124,125,127,130,134,136,141] | [7,11,36,38,40,41,42,43,45,47,54,55,58,63,64,66,68,73,84,91,92,95,97,104,112,115,128,133,140] | – |
| CH3 Partial occlusion | [13,18,22,35,46,51,52,61,62,65,67,78,80,82,85,90,100,101,103,109,127,130] | [7,36,40,41,43,45,47,56,57,58,60,64,66,70,73,74,84,86,95,99,104,106,107,110,112,113,119,123,132,134,137,138] | – |
| CH4 Sign damage | [13,18,35,61,65,70,85,101] | [41,45,63,66,74,84,99,110,113,123] | – |
| CH5 Complex scenarios | [13,18,35,38,39,51,52,62,76,78,80,81,83,89,90,95,103,105,109,111,112,116,124,130,134] | [41,48,55,60,66,74,84,102,123,131,133] | – |
| CH6 Inadequate image quality | [13,18,22,35,39,42,62,64,70,71,78,81,82,83,105,109,114,117,120,124,130,134] | [7,36,40,41,45,47,58,63,66,84,91,106,113,123,131,135,137] | – |
| CH7 Geographic regions | [36,38,56,57,60,75,92] | – | – |
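As a generic illustration of how solutions to CH1 and CH2 often begin, the sketch below applies two photometric augmentations: a gamma shift to mimic lighting fluctuation and an airlight blend to mimic haze. It is a minimal sketch of the augmentation idea using only OpenCV and NumPy, not the method of any specific cited work, and the file names are hypothetical:

```python
import cv2
import numpy as np

def adjust_gamma(img, gamma):
    """Simulate lighting fluctuation (CH1): gamma < 1 brightens, gamma > 1 darkens."""
    table = ((np.arange(256) / 255.0) ** gamma * 255).astype(np.uint8)
    return cv2.LUT(img, table)

def add_haze(img, strength=0.4):
    """Crude haze/fog approximation (CH2): blend the frame toward a gray airlight."""
    airlight = np.full_like(img, 200)  # light-gray "sky" intensity
    return cv2.addWeighted(img, 1.0 - strength, airlight, strength, 0)

# Hypothetical file names; any BGR uint8 road-scene image works.
frame = cv2.imread("road_scene.jpg")
dark = adjust_gamma(frame, gamma=2.0)   # low-light training variant
foggy = add_haze(frame, strength=0.4)   # hazy training variant
cv2.imwrite("road_scene_dark.jpg", dark)
cv2.imwrite("road_scene_foggy.jpg", foggy)
```

Training on such perturbed copies alongside the originals is one common way the surveyed systems harden detectors against illumination and weather variation.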
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
