Review

PRISMA Review: Drones and AI in Inventory Creation of Signage

by Geovanny Satama-Bermeo 1, Jose Manuel Lopez-Guede 1,*, Javad Rahebi 2, Daniel Teso-Fz-Betoño 3, Ana Boyano 4 and Ortzi Akizu-Gardoki 5
1 Department of System Engineering and Automation Control, Faculty of Engineering of Vitoria-Gasteiz, University of the Basque Country (UPV/EHU), Nieves Cano, 12, 01006 Vitoria-Gasteiz, Spain
2 Department of Software Engineering, Istanbul Topkapi University, 34087 Istanbul, Türkiye
3 Department of Electrical Engineering, Faculty of Engineering of Vitoria-Gasteiz, University of the Basque Country (UPV/EHU), Nieves Cano, 12, 01006 Vitoria-Gasteiz, Spain
4 Department of Mechanical Engineering, Faculty of Engineering of Vitoria-Gasteiz, University of the Basque Country (UPV/EHU), Nieves Cano, 12, 01006 Vitoria-Gasteiz, Spain
5 Life Cycle Thinking Group, Department of Graphic Design and Engineering Projects, University of the Basque Country (UPV/EHU), Plaza Ingeniero Torres Quevedo 1, 48013 Bilbao, Spain
* Author to whom correspondence should be addressed.
Drones 2025, 9(3), 221; https://doi.org/10.3390/drones9030221
Submission received: 26 January 2025 / Revised: 13 March 2025 / Accepted: 14 March 2025 / Published: 19 March 2025

Abstract: This systematic review explores the integration of unmanned aerial vehicles (UAVs) and artificial intelligence (AI) in automating road signage inventory creation, employing the preferred reporting items for systematic reviews and meta-analyses (PRISMA) methodology to analyze recent advancements. The study evaluates cutting-edge technologies, including UAVs equipped with deep learning algorithms and advanced sensors like light detection and ranging (LiDAR) and multispectral cameras, highlighting their roles in enhancing traffic sign detection and classification. Key challenges include detecting small or partially obscured signs and adapting to diverse environmental conditions. The findings reveal significant progress in automation, with notable improvements in accuracy, efficiency, and real-time processing capabilities. However, limitations such as computational demands and environmental variability persist. By providing a comprehensive synthesis of current methodologies and performance metrics, this review establishes a robust foundation for future research to advance automated road infrastructure management to improve safety and operational efficiency in urban and rural settings.

1. Introduction

The rapid expansion of urban areas [1,2] and the increase in traffic pose a growing challenge for the efficient management of road signage, an essential component for ensuring safety and proper traffic flow in cities and rural areas [3]. The supervision and maintenance of road signs, which have historically relied on manual methods, currently face multiple limitations. These traditional methods are slow, costly, and prone to human error [4], resulting in incomplete coverage and lower accuracy in identifying and recording the condition of signs [5]. These limitations, combined with the complexity of urban infrastructures [1,2], make it necessary to develop alternatives capable of managing signage inventories in an automated and real-time manner, a critical task for reducing accidents and maintaining well-regulated traffic infrastructures [6].
While in-vehicle systems have been extensively studied for detecting road markings, they face inherent limitations that hinder their effectiveness in specific scenarios [4,5]. These systems are constrained by a limited field of view, which restricts their ability to identify markings obscured by obstacles such as vegetation, parked vehicles, or road debris [6]. Additionally, their reliance on traffic conditions can impede data collection in congested or inaccessible areas, reducing overall coverage and accuracy [7,8]. Unmanned aerial vehicles (UAVs), on the other hand, offer a unique aerial perspective that allows for comprehensive monitoring of large areas without being affected by ground-level obstructions [9,10]. Equipped with advanced sensors like light detection and ranging (LiDAR) and multispectral cameras, drones can detect road markings under diverse environmental conditions, including degraded or partially hidden markings. This capability addresses critical gaps left by in-vehicle systems and provides a robust framework for enhancing road infrastructure management through automated detection and classification [11,12,13,14].
In this context, the use of UAVs equipped with artificial intelligence (AI) technologies [15] has emerged as an innovative and efficient solution to improve automation and accuracy in the creation of road sign inventories. These UAVs, integrated with advanced computer vision algorithms [16,17,18,19,20], can capture high-resolution aerial images [17,21] and process large volumes of data, facilitating the monitoring of vast areas in significantly less time compared to manual inspections [22]. The development of detection and classification algorithms, such as You Only Look Once version 4 (YOLOv4) [18,23,24] and Faster Regions with Convolutional Neural Network (Faster R-CNN) [25,26,27], has shown high performance in identifying traffic signs, even in densely urbanized and complex environments [28]. These algorithms not only optimize inspection time and reduce costs, but they also improve the accuracy and frequency of updates in signage inventories [23].
Despite these advances, technical and operational challenges must be overcome to achieve a fully automated and reliable road sign monitoring system. The accuracy of UAVs and their AI algorithms is affected by factors such as variability in environmental conditions [29] and the need to detect small or partially hidden objects [30]. This task is particularly challenging in urban contexts, where signs can be covered by vegetation, buildings, or other obstacles [29,31]. Recent research has explored the use of advanced image segmentation techniques [32,33], sensors such as LiDAR [11,12,13], and multispectral cameras [14] to improve the quality and accuracy of sign detection and classification. These innovations are helping to overcome critical limitations, but there is still a need for comprehensive and systematic evaluation to compare their strengths and limitations in real operational scenarios.
This study offers a systematic review using the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) methodology [34], which has been applied to rigorously select and analyze relevant research on the automation of signage inventories through UAVs and AI. This review has organized the studies into nine key categories covering different technical and operational aspects. These categories range from the performance of detection [7] and segmentation algorithms [32,33] to multisensor integration [35,36,37,38] and the use of deep learning models (DLMs) [39,40], among others. Classifying them into nine sections allows for a detailed exploration of each relevant aspect in using UAVs and AI, providing a comprehensive view of current technologies and highlighting areas with significant technical challenges.
To facilitate understanding and comparative evaluation of the results, this article includes summary tables that present the performance, accuracy, and mean average precision (mAP) metrics of the algorithms and techniques reviewed [8,41]. These metrics are essential for assessing the performance of each model and its ability to adapt to different environments and operational needs. The tables visualize the quantitative results of each approach, highlighting models that have demonstrated greater effectiveness in detecting and classifying signage. This comparative analysis is crucial for researchers and professionals to understand each method’s strengths and limitations and identify approaches that best meet current demands for automation in road monitoring.
The relevance of this study is also underscored by its ability to contribute to the current scientific landscape, where the literature on the use of UAVs and AI in road signage [19,37,41,42], although significant, still presents limitations in terms of systematization and comprehensive comparison. While noteworthy research has been conducted on detection algorithms and multisensor integration techniques [14], many studies focus only on specific cases or controlled scenarios, making it difficult to generalize results to different operational contexts. Additionally, not all studies consider, in a holistic way, the practical and technical challenges [15], such as environmental variability [5], differences in sign design among regions, and the interoperability of emerging technologies. By applying the PRISMA methodology, this work seeks to fill that gap by providing a structured and complete perspective that not only consolidates existing advances but also identifies areas of opportunity where further research is needed to address persistent challenges. By offering a critical and well-founded synthesis, this study establishes an essential reference framework for scientific and technological advancement in the automation of road signage inventories.
With the exhaustive classification of studies, the analysis of comparative metrics, and the focus on state-of-the-art methodologies, this article aims to provide a comprehensive guide for researchers and professionals interested in the automation of signage inventories, establishing a clear path toward the development of road monitoring systems that meet the safety and efficiency requirements of urban and rural traffic [41].
Figure 1 presents a comprehensive workflow for automating the road signage inventory using UAVs and advanced AI techniques. This process is organized into phases that begin with data capture, where UAVs equipped with cameras and specific sensors record high-resolution images [17,21] and other geospatial data of road infrastructures. The next phase is image preprocessing, in which software tools optimize the captured images, ensuring clarity for subsequent analysis. In the detection and classification phase, advanced algorithms automatically identify and classify the signs present in the environment. Finally, the processed data are integrated into a reporting and analysis system, providing a detailed and up-to-date inventory that can be used for monitoring and maintaining road signage. This automated workflow enables more efficient, accurate, and less costly management than traditional manual inspection methods.
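As an illustration of the preprocessing phase, the following sketch normalizes contrast and caps the resolution of a captured frame using OpenCV; the file path, target width, and CLAHE settings are placeholder assumptions rather than parameters taken from the reviewed studies.

import cv2

def preprocess_uav_frame(path, target_width=1920):
    # Load a UAV frame, equalize its contrast, and resize it for downstream detection.
    img = cv2.imread(path)
    if img is None:
        raise FileNotFoundError(path)
    # Contrast-limited adaptive histogram equalization (CLAHE) on the luminance channel.
    lab = cv2.cvtColor(img, cv2.COLOR_BGR2LAB)
    l, a, b = cv2.split(lab)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    img = cv2.cvtColor(cv2.merge((clahe.apply(l), a, b)), cv2.COLOR_LAB2BGR)
    # Keep the aspect ratio while capping the width for real-time processing.
    scale = target_width / img.shape[1]
    return cv2.resize(img, None, fx=scale, fy=scale, interpolation=cv2.INTER_AREA)

# Example usage with a placeholder path:
# frame = preprocess_uav_frame("uav_frames/road_segment_001.jpg")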
The rest of this article is organized as follows: Section 2 focuses on the PRISMA methodology, detailing the systematic search, selection, and analysis of relevant studies on the automated generation of road signage inventories using UAVs and AI. This section is technical and methodological, describing in detail the inclusion and exclusion criteria and the stages of the workflow used to ensure the reproducibility of the analysis. In Section 3, the theoretical framework is developed, a more conceptual, literature-based part that examines the fundamentals of AI, autonomous decision-making, and the integration of UAVs with AI technologies, providing context to understand the basic principles behind the reviewed applications. Section 4 addresses the results with a quantitative and analytical focus. Here, performance metrics of automated signage detection and classification models, such as accuracy and mAP, are presented, supported by graphs, tables, and comparative analyses illustrating the most advanced applications and their effectiveness in practical scenarios. Section 5 offers a reflective discussion on the findings, highlighting the strengths and limitations of the analyzed studies and presenting a critical view of the technical and operational challenges that still need to be overcome to achieve optimal automation. Finally, Section 6 synthesizes the article’s key findings and suggests specific directions for future research, emphasizing the areas with the most significant potential for developing more robust and effective technologies.

2. Systematic Review Methodology

The PRISMA methodology was used to carry out this systematic review [34]. This standardized approach ensures transparency and thoroughness at each stage of the review process, from identifying and selecting relevant studies to analyzing the results. Applying the PRISMA methodology allows for precise and reproducible documentation of the process, ensuring that decisions made during the search and filtering of information are easily verifiable and justified. It also facilitates the identification of knowledge gaps and the establishment of solid conclusions based on the evidence collected.
Figure 2 provides a detailed summary of the stages followed in this systematic review to identify, select, and analyze relevant studies on the automation of road sign inventories using UAVs and AI. In the first stage, called identification (Section 2.1), an exhaustive search strategy was designed using specific keywords and Boolean operators. In the second stage, review (Section 2.2), the titles and abstracts of the 523 articles were manually evaluated by applying the defined inclusion and exclusion criteria. At this stage, two main categories were distinguished: (1) records that contributed to the theoretical framework and the general context of the state of the art and (2) studies that provided key empirical data to support the quantitative findings. Finally, in the inclusion phase (Section 3 and Section 4), the 153 most relevant studies were integrated into the systematic analysis, distributed among those aimed at building the theoretical context (96 records) and those that offered more detailed empirical analyses (57 records). This methodological approach, supported by the use of the PRISMA diagram, ensures transparency, reproducibility, and rigor in the process, providing a comprehensive view of advances and challenges in automating road sign inventories using UAVs and AI.
The objective of this study was to answer a series of key questions addressing the leading technical and operational aspects of automated road signage detection using UAVs and AI. These questions were posed to identify the most effective technologies and the inherent challenges in their practical implementation.
P1: Which UAVs and AI-based technologies are most effective for the automated detection of road signage?
P2: Which image processing algorithms and deep learning techniques are the most accurate in urban and rural contexts?
P3: What are the main technical challenges in implementing these solutions in real-world scenarios?

2.1. Identification

The search strategy was designed to identify relevant studies addressing the automated generation of signage inventories through UAVs and AI. Comprehensive searches were carried out in three recognized scientific databases: Web of Science [43], ScienceDirect [44], and Scopus [45]. These databases were chosen for their broad coverage in technological and scientific fields and their ability to index high-quality, peer-reviewed studies.
To ensure the relevance of identified studies, combinations of keywords were used, such as “traffic sign”, “drone”, “object detection”, “deep learning”, and “machine learning”. Boolean operators (AND, OR) were also employed to broaden or narrow the search. The search focused on studies published between 2014 and 2024 in order to include the most recent and relevant developments in these areas, as illustrated in Figure 3. Table 1 shows the equations used for the search in the different databases.
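For illustration, a query of the kind listed in Table 1 can be written in Scopus advanced-search syntax as follows; this string is a representative example built from the keywords above, not a verbatim reproduction of the equations actually used.

TITLE-ABS-KEY ( "traffic sign" AND ( "drone" OR "UAV" ) AND ( "object detection" OR "deep learning" OR "machine learning" ) ) AND PUBYEAR > 2013 AND PUBYEAR < 2025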
The initial search yielded a total of 533 studies—14 from Web of Science, 248 from ScienceDirect, and 271 from Scopus—which ensured broad coverage of the existing literature in this field.

2.2. Study Review

The studies were selected in two phases: an initial review of titles and abstracts, followed by a comprehensive evaluation of the full text of potentially relevant studies.

2.2.1. Inclusion and Exclusion Criteria

The inclusion criteria were (1) empirical studies addressing the use of UAVs [15] for automated road signage detection; (2) studies applying AI techniques, such as deep learning, for the detection or classification of signage; and (3) studies published between 2014 and 2024 in peer-reviewed scientific journals. Conversely, exclusion criteria were established to discard studies that were not empirical (for example, literature reviews or opinion articles) [34], studies that did not present concrete experimental results, and those focusing on UAVs and AI applications outside the context of road signage, such as agriculture or surveillance. Duplicate studies and those without full access were also excluded.
To address the impact of variability in experimental settings on the comparability of results, we ensured that only empirical studies providing detailed descriptions of their experimental setups—such as dataset size, hardware configurations, and environmental conditions—were included. This allowed us to focus on research with clearly defined methodologies and thereby minimize inconsistencies. However, we acknowledge that differences in these variables across the selected studies may influence reported performance metrics like accuracy or mAP. While this variability is inherent when synthesizing results from diverse sources, prioritizing studies with comprehensive experimental details helps mitigate its effect and ensures a more robust comparative analysis.

2.2.2. Application of Criteria

The study selection process was meticulously organized to identify relevant, high-quality research on the automation of road signage inventories using UAVs and AI. Several phases were carried out to ensure methodological rigor and minimize biases.
In the first phase, called identification, an exhaustive database search was conducted, yielding 533 articles related to UAVs, AI algorithms, and image-processing techniques applied to road signage. Specifically, 14 articles were found in Web of Science, 248 in ScienceDirect, and 271 in Scopus. Excel was used to consolidate the results, leaving 523 non-duplicate articles after removing 10 duplicates.
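The consolidation and duplicate removal were performed in Excel; for reproducibility, an equivalent step can be scripted as in the sketch below, where the export file names and column labels are assumptions rather than the actual spreadsheet structure.

import pandas as pd

# Assumed database exports; the review consolidated these records in Excel.
frames = [pd.read_csv(name) for name in ("web_of_science.csv", "sciencedirect.csv", "scopus.csv")]
records = pd.concat(frames, ignore_index=True)  # 533 raw records in total

# Remove duplicates by DOI, falling back to a normalized title when the DOI is missing.
records["dedup_key"] = records["DOI"].fillna(records["Title"].str.lower().str.strip())
unique_records = records.drop_duplicates(subset="dedup_key").drop(columns="dedup_key")

print(len(records), "retrieved /", len(unique_records), "after duplicate removal")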
During the review phase, inclusion and exclusion criteria were applied to the titles and abstracts of the 523 records. Each study was manually examined to verify its relevance to using UAVs and AI in road signage. As a result, 370 articles were removed because they had objectives different from the proposed study, lacked empirical data, contained irrelevant applications, or did not have full-text access, leaving 153 for detailed evaluation.
In the eligibility phase, the 153 studies were reviewed and classified according to their relevance and applicability to the research topic. In the final inclusion phase, these studies were divided into two main categories: 96 records intended for context review, providing the essential theoretical framework for the state of the art, and 57 records for detailed analysis, which were thoroughly examined to extract key empirical data and generate fundamental research findings. This process was documented using a PRISMA flow diagram to ensure transparency and reproducibility.

3. State of the Art

Using UAVs and AI in the automated management of road signage inventories has opened new possibilities for efficiency and precision in road infrastructure management. This section examines recent developments in the key areas underpinning this technology: UAVs and their components, automated inventory systems in road infrastructure, detection and classification of signage, semantic segmentation and aerial image processing, multisensor data fusion, and UAV applications in road safety.

3.1. UAVs for Signage Detection

UAVs have emerged as a valuable tool in monitoring and managing road infrastructure, thanks to their ability to capture detailed aerial images of large areas in a relatively short time [9,46]. The capacity of UAVs to fly over complex and difficult-to-access areas allows for comprehensive supervision of road signage, especially in dense urban or extensive rural areas [1]. Their role in signage detection has been strengthened by integrating advanced components that maximize detection accuracy and speed [8,41].
Among the most important components for signage detection with UAVs are high-resolution red, green, and blue (RGB) camera systems [46], LiDAR sensors, and multispectral cameras [14]. RGB cameras [10] capture high-definition color images that accurately identify traffic signs and other road elements. These systems are essential for computer vision applications [17,18,19,20], as they enable the detection of important details, such as the color and shape of signs. LiDAR sensors, on the other hand, provide depth data and allow for the creation of three-dimensional models of the environment, which is crucial for accurately mapping infrastructure and recognizing the exact position of signage relative to its surroundings [47,48].
Multispectral cameras [14] are another critical component, as they provide data across multiple spectral bands, facilitating detection in low visibility conditions, such as in shadowed environments or reduced lighting [49,50,51]. This multispectral capability is particularly useful for improving signage classification accuracy and adapting the system to various environmental conditions [42,52]. The combination of these components with AI algorithms allows UAVs to perform rapid and accurate detections [23,53], optimizing the process of signage inventory and improving road safety, as illustrated in Figure 4.

3.2. Applications of UAVs for Road Safety

The use of UAVs for road sign detection and improving safety in transportation infrastructure has been a constantly expanding field of research [7,41]. The ability of UAVs to capture high-resolution aerial images [21,54,55] and their integration with AI techniques has revolutionized the way road infrastructure is monitored and managed, allowing for more accurate detection of traffic signs and potential safety risks [56].
One of the most significant advances in this area is using UAVs for safety assessment at road intersections through aerial image analysis [3]. In [57], a study on safety evaluation of roundabouts using UAV photography and the traffic conflicts technique is presented. This approach allows for identifying potential conflict points at intersections, which may vary depending on the type of intersection. For example, roundabouts often involve merging and yielding dynamics distinct from simple crossings, which affect how conflicts are identified and addressed. Additionally, these conflict points are closely tied to the condition and placement of road signage. UAVs equipped with advanced sensors can detect issues such as obscured or damaged signs, which directly contribute to traffic conflicts by creating driver uncertainty [7]. Similarly, Ref. [41] emphasizes that this methodology significantly improves upon traditional observational techniques by reducing human error and increasing efficiency in data collection. The precision with which UAVs can capture this data in real-time represents a significant improvement compared to traditional observation and analysis methods [7,41].
Automated road sign detection has been another field that has benefited from the use of UAVs and computer vision techniques [16,17,18]. In [29], a UAV-based system for traffic sign detection using deep learning algorithms is described [19,20]. This system, which combines aerial images captured by UAVs with convolutional neural networks (CNNs), allows for automatic identification of traffic signs in urban and rural areas [25,29]. The results show that this approach improves efficiency and accuracy in road sign detection, which is crucial for proactive maintenance of road infrastructure and traffic safety [29].
Furthermore, in [22], the use of UAVs for creating road sign inventories in rural areas is investigated. UAVs allow for capturing detailed images of traffic signs, even in hard-to-reach areas, facilitating monitoring and maintaining road infrastructure in remote zones [4,17]. The use of UAVs for these tasks reduces the time and cost associated with manual inspection and improves the accuracy of collected data, allowing for better resource management and planning of future interventions [4].
In [58], the integration of real-time traffic sign detection techniques with UAVs to improve road safety systems is addressed. UAVs equipped with advanced cameras and sensors can detect defective or damaged traffic signs and send automatic alerts to competent authorities for rapid repair [4,17,59]. This proactive approach to road maintenance significantly improves driver safety by reducing response time to detected signage problems.
In addition to road sign detection [3,4,17,59], UAVs have also been used to monitor traffic behavior and detect potential safety risks. In [60], real-time traffic flow is reconstructed using cameras mounted on UAVs. This approach allows for analyzing vehicle behavior in complex situations, such as congested intersections or high-speed areas, providing valuable data for road safety planning and improvement [59].
Using UAVs for sign detection and road safety improvement has proven to be an effective tool in transportation infrastructure management [17]. The ability of UAVs to capture high-resolution images and their integration with AI techniques has enabled automated sign detection and real-time traffic monitoring, contributing to greater safety and efficiency in road management [21].

3.3. Automated Inventory Systems in Road Infrastructure

The maintenance and optimization of road infrastructure is crucial for ensuring road safety [61]. Traditionally, inspection and inventory of elements such as signage and pavement have been carried out manually, a process that is costly, time-consuming, and prone to human error [4]. With the introduction of intelligent transportation systems (ITS), automated inventory opens up new opportunities to improve accuracy and efficiency in these tasks. Automation not only allows for more frequent and detailed inspections but also reduces workers’ exposure to hazardous conditions on busy roads [3].
The development of inventory systems that combine automatic signage recognition technologies with geographic information systems (GIS) represents a notable advance in road infrastructure management [62]. By integrating signage element location data into a GIS database, it is possible to create an accurate and georeferenced record of critical infrastructure elements [63]. These systems also allow for the detection of changes and potential damage to signage, facilitating preventive maintenance and improving road safety through early warnings [25], as shown in Figure 5.
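As a minimal sketch of how detected signs can be written into such a georeferenced record, the example below builds a GIS-ready layer with geopandas; the coordinates, class labels, and condition flags are hypothetical placeholders.

import geopandas as gpd
from shapely.geometry import Point

# Hypothetical detections: (longitude, latitude, sign class, observed condition).
detections = [
    (-2.6716, 42.8467, "stop", "ok"),
    (-2.6702, 42.8471, "yield", "partially occluded"),
]

inventory = gpd.GeoDataFrame(
    {
        "sign_class": [d[2] for d in detections],
        "condition": [d[3] for d in detections],
    },
    geometry=[Point(d[0], d[1]) for d in detections],
    crs="EPSG:4326",  # WGS 84 coordinates from the UAV's GNSS receiver
)

# Persist the layer so it can be loaded into a GIS for maintenance planning.
inventory.to_file("sign_inventory.gpkg", layer="traffic_signs", driver="GPKG")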
However, the adoption of automated inventory systems still faces certain limitations. At the technological level, many current systems require additional development to adapt to the variability of road environments, such as changes in lighting and adverse weather conditions [64]. Despite these challenges, the potential of automated inventory systems in road infrastructure monitoring is undeniable, and their integration into road management could significantly reduce operational costs and improve road safety.

3.4. Detection Technologies

Advances in deep learning algorithms have taken signage detection to a new level of precision. Models such as YOLOv4 and Faster R-CNN have shown exceptional capabilities for identifying traffic signs in real-time in complex and varied environments, especially in urban areas [1,65] with high object density [65]. These models detect signs in a dense environment and have also demonstrated the ability to classify signs of different sizes and orientations [65,66].
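To illustrate how such detectors are typically applied to a single aerial frame, the sketch below runs a pretrained Faster R-CNN from torchvision; it uses generic COCO weights and a placeholder image path, not the sign-specific models evaluated in the cited studies.

import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

# Generic pretrained detector; sign-specific work fine-tunes such a model on a
# traffic-sign dataset before deploying it on UAV imagery.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

image = Image.open("uav_frames/road_segment_001.jpg").convert("RGB")  # placeholder path
with torch.no_grad():
    prediction = model([to_tensor(image)])[0]

# Keep only confident detections.
keep = prediction["scores"] > 0.5
boxes, labels = prediction["boxes"][keep], prediction["labels"][keep]
print(len(boxes), "objects detected above the 0.5 confidence threshold")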
The incorporation of techniques such as attention mechanisms in YOLOv4 has improved the detection of small and covered objects, critical aspects in road signage monitoring, where objects may be partially blocked or in unfavorable positions [66,67]. A comparison between YOLOv4 and Faster R-CNN in urban and rural conditions showed that YOLOv4 outperformed Faster R-CNN in detection accuracy by 5% in urban environments, while Faster R-CNN obtained better results in areas with lower object density [68,69,70]. This suggests that both models present unique advantages depending on the operational context, which should be considered when designing real-time monitoring systems for road signage. Table 2 and Table 3 present the precision and recall results of these models under different conditions, highlighting performance variations according to the environment [1,4].
Despite their high accuracy, implementing these models in UAVs for signage detection faces several technical challenges. One of the main obstacles is the high computational cost, as real-time processing of high-resolution images [17,21] requires specialized and optimized hardware [71]. This aspect poses a limitation for the widespread adoption of these systems, especially in low-budget applications. However, recent research has proposed optimization techniques in neural networks to reduce computational weight and improve efficiency in signage detection in complex urban environments.
In addition to visual detection techniques discussed in this review, recent advancements such as SignParser and SignEye leverage natural language processing to interpret traffic signs. These approaches offer innovative perspectives by combining semantic understanding with visual data, representing promising complementary methods to improve automated road signage inventory systems. Although these methods differ from the visual detection technologies analyzed here, they highlight future directions for integrating multimodal approaches in traffic sign management.

3.5. Surface and Object Detection in Urban Applications

The detection of surfaces and objects in urban environments has advanced significantly thanks to the combination of technologies such as LiDAR [11,12,13], RGB cameras, and deep learning algorithms. These technologies allow the generation of detailed and accurate three-dimensional models, fundamental for urban planning, infrastructure detection, and monitoring of densely populated areas [1,65]. Multispectral data fusion [14] has proven especially useful in overcoming the inherent challenges of mapping and analyzing urban environments, such as variability in lighting conditions and obstacles. One of the key approaches in urban surface detection [1] is the fusion of optical images with LiDAR data. In [72], a study on the classification of urban surfaces through the integration of images captured by UAVs and LiDAR data is presented. The results show that combining these data improves accuracy in classifying surfaces such as roads, buildings, and green areas, allowing a more detailed view of the urban environment. This ability to fuse data from multiple sensors [35,36,37] has improved the quality of urban maps and infrastructure planning.
In change detection in urban surfaces, techniques based on point clouds have also been widely used [73,74,75,76]. In [77], the use of point clouds generated by LiDAR to detect changes in urban environments, such as road erosion or infrastructure wear, is explored. Table 4 provides a detailed comparison of UAVs detection technologies, outlining their operating principles, advantages, and disadvantages. This approach allows for the monitoring of surface deterioration and planning repairs more effectively, contributing to the preventive maintenance of road infrastructure. Additionally, the ability to detect changes in the urban environment from three-dimensional data provides a valuable tool for infrastructure management [47]. Another important application of object detection in urban environments is the identification of obstacles on public roads, a crucial task for improving road safety. In [77], a system based on LiDAR and optical cameras for detecting objects on urban roads, such as traffic signs, pedestrians, and vehicles, is presented. This approach has proven highly effective in the real-time detection of road obstacles, alerting authorities to potential hazards and facilitating decision-making to improve traffic safety.
Furthermore, the creation of automated urban infrastructure inventories has been optimized by using advanced object detection technologies [78]. In [79,80], a system for detecting urban surfaces is described for generating inventories of elements such as traffic signs, bus shelters, and urban furniture. These automated inventories are essential for efficient infrastructure management, as they allow for constant updating of urban assets and optimize resources allocated to maintenance and repair. In summary, surface and object detection advances in urban applications have significantly improved infrastructure management and urban planning [1,65]. The fusion of data from multiple sensors, such as LiDAR and optical cameras, along with deep learning techniques [39,40,81,82,83,84], has enabled greater precision in the identification and monitoring of urban elements, contributing to road safety and the sustainability of the urban environment.
Automated urban infrastructure inventories have significantly advanced through the integration of technologies such as UAVs, LiDAR sensors, and AI-based object detection algorithms. These inventories enable the efficient cataloging and monitoring of urban elements, including traffic signs, bus shelters, and streetlights, ensuring real-time updates of urban assets. For instance, Ref. [72] demonstrated that combining UAV-based multisensor data fusion with deep learning techniques enhances urban land cover mapping, providing critical data for infrastructure planning and environmental management. Similarly, Ref. [13] highlighted how automated detection systems improve reliability in monitoring infrastructure while reducing inspection times. By integrating data from LiDAR sensors and optical cameras with deep learning algorithms, these inventories achieve high precision even under challenging conditions such as poor lighting or occlusions caused by vegetation. These advancements contribute to road safety and sustainability by enabling data-driven decisions that optimize resource use while minimizing waste.
Table 4. Comparison of different UAVs detection technologies [85].

Radar-based
  Operating principle: Utilizes radio waves to detect and locate nearby objects.
  Advantages:
  • Long-range capability
  • All-weather performance
  • Ability to recognize micro-Doppler signatures (MDS); velocity and direction measurement
  Disadvantages:
  • Limited detection capability due to low radar cross-section (RCS)
  • Performance limitations due to low altitudes and velocities
  • High cost and deployment complexity

Radio Frequency-based
  Operating principle: Captures wireless signals to detect UAVs radio frequency signatures.
  Advantages:
  • Long-range detection and identification
  • All-weather resilience
  • Capability to capture UAVs and operator communication signals and spectra
  • Ability to distinguish different UAVs types
  Disadvantages:
  • Unable to identify UAVs
  • Interference from other radio frequency sources
  • Vulnerable to hacking

Acoustic-based
  Operating principle: Detects UAVs by their unique sound signatures.
  Advantages:
  • Cost-effective
  • Non-line-of-sight (NLOS) capability
  • Rapid deployment
  Disadvantages:
  • Background noise interference
  • Limited detection range
  • Susceptibility to wind conditions

Vision-based
  Operating principle: Captures visual data of the UAVs using camera sensors.
  Advantages:
  • Visual confirmation
  • Non-intrusive
  • Cost-effective
  Disadvantages:
  • Limited detection range and requires line-of-sight (LOS)
  • Dependence on weather and lighting conditions

3.6. Advances in Small and Hidden Object Detection

The detection of small and hidden objects in complex scenarios remains a significant challenge for computer vision and deep learning systems [17,18,19,20]. This task is crucial for road safety and precision agriculture applications. Recent advances in deep learning algorithms [81,82,83,84] and multisensor integration [35,36,37] have substantially improved real-time detection capabilities in challenging environments.
CNN models optimized for small object detection [25,29,72] have shown promising results. Networks like YOLOv4 and Faster R-CNN have been adapted to handle complex urban traffic and infrastructure inspection scenarios. The accuracy of small object detection has increased significantly with the inclusion of attention mechanisms, enhancing the networks’ ability to focus on relevant image areas [86]. In addition, in scenarios where objects may be partially hidden, such as autonomous driving systems, detection becomes more complex. Deep neural networks (DNNs) and segmentation techniques have been employed to improve accuracy, with LiDAR sensors combined with RGB cameras enhancing detection capabilities in densely populated urban environments [10,11,12,13,32,87].
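The attention mechanisms mentioned above are usually lightweight modules inserted into the detector backbone; the sketch below shows a squeeze-and-excitation style channel-attention block in PyTorch as one common variant, not the specific mechanism of any single reviewed model.

import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    # Squeeze-and-excitation style block: reweights feature channels so the network
    # can emphasize responses relevant to small or partially hidden signs.
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)  # squeeze: global spatial context per channel
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),  # excitation: one weight per channel
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        weights = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * weights  # rescale the feature map channel-wise

# Example: attach to a 256-channel feature map from a detector backbone.
features = torch.randn(1, 256, 64, 64)
print(ChannelAttention(256)(features).shape)  # torch.Size([1, 256, 64, 64])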
The development of specialized datasets, such as the Manipal-UAVs Person Detection Dataset [88], has been crucial for training and evaluating these models in real-world scenarios. This dataset is designed for person detection in images captured by UAVs, focusing on identifying small objects like people viewed from considerable heights. Regarding practical applications, small object detection has been implemented in anti-UAVs systems [89,90], where precise identification of small aerial objects is crucial for security. An improved model based on YOLOv7 [89,90,91,92] uses an attention mechanism to detect and classify multiple aerial targets in real-time. This approach is vital for security and defense applications, where rapid and accurate detection of small UAVs can prevent potential incidents.
These advancements, driven by improved algorithms, multispectral sensors [14], and specialized datasets, enable more accurate real-time detection of previously challenging objects. This progress has significantly expanded the potential applications of detection systems, opening new possibilities in fields such as security, agriculture, and urban management, where precision and efficiency are increasingly essential [1,65].

3.7. Semantic Segmentation for Object Detection

Semantic segmentation is a fundamental technique in image processing that allows for the identification and classification of objects within a scene by associating each pixel of an image with a specific class [93]. This technique has gained relevance in various applications, such as object detection in urban, agricultural, and road infrastructure environments, improving the accuracy and efficiency of automated systems [94]. A key aspect of semantic segmentation is its ability to process large volumes of data, making it an essential tool for identifying multiple objects simultaneously. In [19], the most advanced deep learning techniques applied to semantic segmentation of images and videos are reviewed, highlighting the advances achieved with CNN architectures and transformer-based networks. These techniques have allowed for more precise segmentation in images of high complexity and object density, such as in traffic scenarios or infrastructure monitoring [95], as shown in Figure 6.
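As a minimal illustration of pixel-wise classification, the sketch below applies a pretrained DeepLabV3 model from torchvision to an aerial frame; the weights and class set are generic placeholders rather than the road-specific models discussed in the cited works.

import torch
import torchvision
from torchvision.transforms.functional import to_tensor, normalize
from PIL import Image

model = torchvision.models.segmentation.deeplabv3_resnet50(weights="DEFAULT")
model.eval()

image = Image.open("uav_frames/road_segment_001.jpg").convert("RGB")  # placeholder path
x = normalize(to_tensor(image), mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])

with torch.no_grad():
    logits = model(x.unsqueeze(0))["out"]  # shape: (1, num_classes, H, W)

# Per-pixel class map: every pixel is assigned exactly one semantic label.
class_map = logits.argmax(dim=1).squeeze(0)
print(class_map.shape, class_map.unique())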
Particularly in object detection in road environments, semantic segmentation has been successfully applied to improve the identification of traffic signs and other elements related to road safety. In [72], a system is presented that uses CNNs for the joint segmentation of road objects and lanes in traffic images. This approach allows for the simultaneous identification of traffic signs and lane lines, essential for autonomous driving systems. The use of CNNs has proven to be highly effective in classifying and segmenting these objects in real-time, improving the accuracy of autonomous vehicles in decision-making [25,29]. In addition to its application in road environments, semantic segmentation has proven helpful in classifying and detecting urban surfaces [1] through the fusion of data from multiple sensors [35,36,37]. In [22], advances in multisensory data fusion, such as LiDAR and optical cameras, are explored to improve accuracy in urban surface segmentation [1]. This approach allows for detecting hidden or partially covered objects, which is crucial for urban and rural monitoring applications. The combination of RGB images with depth data obtained by LiDAR has allowed for more precise segmentation of small objects, such as traffic signs or vegetation, which would be difficult to detect with conventional techniques.
In [22], the use of real-time semantic segmentation applied to traffic flow reconstruction in cities is addressed using lightweight cameras mounted on edge devices. This approach is particularly relevant for object detection in dynamic scenarios, such as vehicular traffic, where semantic segmentation must operate with high levels of precision and in real-time. The authors demonstrate that integrating segmentation techniques with edge devices allows for efficient reconstruction of vehicular flow without relying on costly infrastructures. Semantic segmentation has revolutionized how objects are detected and classified in various applications. Its ability to process large volumes of data in real-time and to fuse information from multiple sensors [35,36,37] has significantly improved the accuracy of detection systems in complex environments, such as cities, roads, and agricultural fields.

3.8. Multisensor Data Fusion

Multisensor data fusion [96] has emerged as a key technique for improving the accuracy and reliability of detection and navigation systems, particularly in applications where the combination of multiple data sources can offer a more robust and detailed view of the environment. The integration of sensors such as RGB cameras and multispectral sensors has enabled significant advances in autonomous navigation and urban infrastructure monitoring [91], as shown in Figure 7.
In [72], the fusion of multisensor data for urban land cover classification using UAVs is explored, combining optical images and LiDAR data [1,11,12,13,78]. The combination of different types of data aids in overcoming the inherent limitations of each sensor, such as the sensitivity of optical cameras to light conditions or the difficulty of detecting certain materials [11,12,13] with LiDAR sensors. The study demonstrates that the use of deep neural networks [97,98,99] to fuse these data significantly improves classification accuracy, especially in densely populated urban areas [1], where shadows and buildings can affect the quality of captured data.
The use of UAVs with multispectral sensors and LiDAR has also enabled advances in autonomous navigation, improving vehicles’ ability to detect and avoid obstacles in real-time [14]. In [5], an object detection and segmentation system using UAVs is presented, which utilizes LiDAR and RGB camera data to identify obstacles in challenging environments [10]. This approach allows UAVs to navigate autonomously in complex areas, such as densely populated cities or areas with dense vegetation [100]. Fusing data from multiple sensors enhances the system’s ability to identify hidden or partially covered objects, which is essential for safe real-time navigation [35,36,37].
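A core step in this kind of LiDAR and RGB fusion is projecting the 3D points into the camera image so that depth can be attached to detected pixels; the sketch below performs that projection with numpy, assuming an already-calibrated pinhole camera matrix and LiDAR-to-camera extrinsics (all numerical values are placeholders).

import numpy as np

# Placeholder calibration: camera intrinsics K and LiDAR-to-camera extrinsics R, t.
K = np.array([[1000.0, 0.0, 960.0],
              [0.0, 1000.0, 540.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)                     # rotation from the LiDAR frame to the camera frame
t = np.array([0.0, -0.05, 0.10])  # translation in meters

def project_lidar_to_image(points_lidar):
    # Project Nx3 LiDAR points to (u, v, depth) pixel coordinates.
    points_cam = points_lidar @ R.T + t               # express the points in the camera frame
    points_cam = points_cam[points_cam[:, 2] > 0.0]   # keep points in front of the camera
    uvw = points_cam @ K.T                            # pinhole projection
    uv = uvw[:, :2] / uvw[:, 2:3]                     # normalize by depth
    return np.hstack([uv, points_cam[:, 2:3]])

# Example: three synthetic LiDAR returns, one of them behind the camera.
cloud = np.array([[2.0, 0.5, 10.0], [1.0, -0.2, 15.0], [0.0, 0.0, -5.0]])
print(project_lidar_to_image(cloud))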
Another key application field for multisensor data fusion is the creation of three-dimensional urban models, where the integration of optical images and LiDAR data has proven particularly effective [11,12,13]. In [101], a data fusion method is proposed that uses UAVs for object detection and 3D urban map generation. This approach allows for creating accurate representations of the urban environment, which is helpful for urban planning [1], infrastructure management, and autonomous navigation of ground and aerial vehicles [3]. The study demonstrates that the fusion of data from multiple sensors provides a more detailed view of the environment than using a single type of sensor, improving the quality of the generated maps. Multisensor data fusion has transformed the way UAVs and autonomous navigation systems detect and process environmental information.

3.9. Data Fusion and Automatic Registration in Urban Applications

Data fusion and automatic point cloud registration are essential techniques for urban infrastructure detection and analysis, which have improved accuracy and automation in generating three-dimensional models of urban areas [73,74,75,76]. These techniques allow for the integration of multiple types of data, such as LiDAR and optical data, to obtain a more detailed and accurate view of the urban environment, facilitating urban planning, infrastructure management, and monitoring of changes in the urban landscape [102].
A notable approach is using automatic methods to optimize building classification in urban environments. In [31], a weak sample optimization method for building classification is presented using a semi-supervised deep learning framework. This method allows for the efficient classification of structures in areas where labeled data is limited, improving the accuracy of classification models by using unlabeled data in training. This approach is particularly useful in urban planning, where accurate identification of buildings and other urban objects is crucial for efficient infrastructure management.
Furthermore, data fusion and joint registration of mobile LiDAR point clouds [102,103,104] in repeated areas has been a central topic in improving urban mapping. In [105], an automated multi-constraint joint registration method for mobile point clouds captured in repeated areas is proposed. This method optimizes the alignment of multiple LiDAR scans to generate a coherent map of the urban area, allowing for the detection of environmental changes and facilitating the updating of urban models over time. This is especially useful in applications such as infrastructure monitoring and detecting changes in buildings or roads [104].
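Pairwise iterative closest point (ICP) alignment is the basic building block behind such registration pipelines; the sketch below shows a single point-to-point alignment with Open3D as a simplified stand-in for the multi-constraint joint registration of [105] (the file names, correspondence threshold, and identity initialization are placeholders).

import numpy as np
import open3d as o3d

# Placeholder scans of the same street segment captured in two passes.
source = o3d.io.read_point_cloud("scan_pass_1.pcd")
target = o3d.io.read_point_cloud("scan_pass_2.pcd")

# Pairwise ICP with a 0.5 m correspondence threshold and an identity initial guess;
# joint multi-scan methods refine many such pairwise constraints simultaneously.
result = o3d.pipelines.registration.registration_icp(
    source, target, 0.5, np.eye(4),
    o3d.pipelines.registration.TransformationEstimationPointToPoint(),
)

print("fitness:", result.fitness, "inlier RMSE:", result.inlier_rmse)
source.transform(result.transformation)  # align the first pass onto the second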
The evaluation of LiDAR and photogrammetry data fusion on various surfaces has proven effective in creating detailed urban maps and detecting objects in densely populated environments [102,103,104]. Although LiDAR and photogrammetry primarily capture surface-level features, their combination significantly enhances the accuracy and precision of surface modeling, enabling the identification of subtle surface indicators associated with underground infrastructure. For instance, LiDAR sensors capture precise elevation data and generate high-resolution 3D point clouds that can reveal subtle ground surface variations indicative of buried elements, such as slight depressions or elevations caused by underground utilities or infrastructure [106]. Photogrammetry complements this by providing high-resolution imagery capable of detecting visual cues on the surface, such as road markings, manhole covers, utility access points, or repaired pavement areas typically associated with underground services [106]. By integrating these two technologies, it becomes possible to infer the presence and location of underground infrastructure indirectly through surface-level indicators, significantly improving accuracy compared to using either method individually [104,106]. These combined results are particularly valuable for generating high-precision 3D models that support urban monitoring, infrastructure planning, and preliminary identification of potential locations of underground services prior to more invasive inspection methods [104,106].
Another study that significantly contributes to data fusion in urban applications is that of [107], where an object-based framework for classifying point clouds obtained through airborne LiDAR systems such as airborne laser scanning (ALS) in urban areas is introduced [105]. This technique allows for the classification of different elements of the urban environment, such as buildings, trees, and roads, through the fusion of data obtained from multiple viewpoints [104]. Object-based classification improves the ability to handle large volumes of data and allows for more precise segmentation of urban components [1].
Similarly, data fusion for detecting changes in the urban landscape has enabled advances in urban area management and planning [1]. In [108], a methodology for detecting changes in urban areas through the fusion of images and LiDAR point clouds is proposed [102,103,104], facilitating the identification of modifications in urban topography and infrastructure. This approach is crucial for managing urban renewal projects and detecting changes in buildings and other urban elements [94].
Data fusion, automatic registration, and classification methods have revolutionized how urban models are generated and updated [1]. The use of advanced deep learning techniques and the combination of LiDAR data [102,103,104] with optical images has allowed for greater accuracy in change detection and the classification of urban infrastructure, significantly improving urban planning and infrastructure management [1,94].

4. Quantitative Results

The systematic review enabled a quantitative analysis of the selected studies, focusing on the most relevant performance metrics for the automated detection of road signage using UAVs and AI algorithms. The quantitative results focused on evaluating the performance of DLMs in terms of accuracy, recall, and F1-score, fundamental metrics for understanding the effectiveness of the systems in different operational environments.
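For reference, the detection metrics referred to throughout this section can be written as follows, where TP, FP, and FN denote true positives, false positives, and false negatives, and AP_c is the average precision (area under the precision-recall curve) for class c over C classes:

\text{Precision} = \frac{TP}{TP + FP}, \qquad \text{Recall} = \frac{TP}{TP + FN},

F_1 = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}, \qquad \text{mAP} = \frac{1}{C} \sum_{c=1}^{C} AP_c .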

4.1. Performance of Detection Algorithms

The analyzed studies showed varied performance of detection algorithms depending on the complexity of the environment and the quality of the captured images. Models based on YOLOv4 [18,23,24] and Faster R-CNN [25,26,27] proved to be the most robust, achieving an average accuracy of 92% in urban environments and 88% in rural environments [18,98]. The difference in results is attributed to variations in lighting, sign size, and environmental conditions. As summarized in Table 5, YOLOv4, in particular, stood out for its ability to detect small objects with a recall of 85%, thanks to the incorporation of attention mechanisms [75].

4.2. Comparison Between Segmentation Techniques

The segmentation techniques employed in the reviewed studies show differences in accuracy and efficiency depending on the type of sensor and AI algorithm. In 3D urban modeling with LiDAR and CNNs, high accuracy is achieved in point clouds [48], while in bridge infrastructures, semantic segmentation proves precise but requires high computational resources [116]. Techniques such as CN4SRSS and DeepLab v3+ enhance segmentation in low-resolution images [54], and Bayesian deep learning optimizes uncertainty management in complex point clouds, though it remains computationally intensive [116].
With recent advancements in traffic sign detection, models like YOLOv8 and YOLO-NAS have achieved a mAP of 95.72%, surpassing previous models in accuracy. This improvement is critical for road safety and real-time traffic monitoring applications [116]. Thus, the combination of multispectral sensors [14] and advanced segmentation algorithms, such as those mentioned, strengthens the precision and reliability of automated systems across multiple environments, as summarized in Table 6.
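For context, recent YOLO releases such as YOLOv8 are distributed through the ultralytics Python package; the sketch below shows the typical inference call on a UAV frame with a generic pretrained checkpoint, not the sign-specific configurations behind the 95.72% mAP reported above.

from ultralytics import YOLO

# Generic pretrained checkpoint; sign-specific work would fine-tune on a
# traffic-sign dataset captured from UAVs before reporting evaluation metrics.
model = YOLO("yolov8n.pt")

results = model("uav_frames/road_segment_001.jpg", conf=0.5)  # placeholder path

for result in results:
    for box in result.boxes:
        class_name = result.names[int(box.cls)]
        print(class_name, float(box.conf), box.xyxy.tolist())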

4.3. Identified Quantitative Challenges

Despite the high accuracy achieved in the detection and segmentation of infrastructures, quantitative challenges persist that affect the effectiveness of these systems. Among them, the high computational cost associated with processing large volumes of real-time data remains a barrier, requiring specialized hardware such as advanced graphics processing units (GPUs), which increases operational costs [117]. Multisensor systems, such as UAVs equipped with LiDAR and RGB cameras, generate a substantial amount of data, which demands robust processing capacity, especially in complex models like YOLOv4 [117]. Table 7 outlines the key quantitative challenges associated with UAVs usage and data processing.
Additionally, adverse environmental conditions, such as variability in lighting, fog, or rain, significantly impact data quality and the accuracy of segmentation and detection models, with up to a 15% reduction in accuracy compared to ideal conditions [48]. This issue is particularly notable in dense urban areas and during weather events, where LiDAR scanning quality and image capture are affected [116].
The lack of uniform standards for algorithm evaluation complicates the comparison of results across different studies and environments, limiting the replicability of systems in real-world applications [116]. Furthermore, the success of AI models heavily depends on large and diverse training datasets, whose availability is limited, hindering the generalization and scalability of these systems [49]. In this context, real-time data synchronization poses an additional challenge in multisensor systems, as it requires significant resources to ensure efficient integration across multiple data sources [118].

4.4. Data Extraction from Relevant Studies

In this research, the selected articles were classified into nine main categories based on the recurring themes and technological approaches observed in the systematic review. Each theme was chosen for its significance in improving the methods and technologies applied to the automated management of road signage using UAVs and AI. Together, these areas encompass the technical, operational, and integration components that ensure an effective and accurate system for monitoring traffic infrastructures.
Figure 8 documents the evolution of research integrating UAVs and AI in the detection and classification of road signage [3], organizing advancements into nine key topics. This classification highlights predominant themes in the literature, allowing for an observation of growth trends and improvements over the years. The topics, ranging from algorithm accuracy in varied environments to the incorporation of multisensor systems [96] and advanced segmentation techniques [19,94,95], provide a comprehensive analysis of the most effective techniques and current challenges in automating signage inventories. Notably, the recent adoption of DLMs such as YOLOv4 [18,23,24] and Faster R-CNN [25,26,27] is emphasized as fundamental for precise real-time traffic sign detection. Furthermore, Figure 8 illustrates how the combination of multisensor data [96] and information fusion has enhanced the reliability of these systems, particularly under adverse environmental conditions and with diverse hardware configurations. This development reflects the consolidation of a growing field of study with direct implications for safety and efficiency in road infrastructure management. Thus, the classification of topics not only underscores contributions from the literature but also points to future directions in this critical area of research.

4.4.1. Application of DLMs in Traffic Sign Detection and Classification

DLMs, such as YOLO [18,23,24,104,120] and Faster R-CNN [25,26,27,99,121,122], have revolutionized the detection and classification of road signage in images captured by UAVs. These techniques enable the precise identification of traffic signs and other road elements, even in dense and visually complex environments [87]. Their adaptability to various lighting conditions and environmental variability makes them particularly suitable for road signage monitoring and management [115]. Additionally, these algorithms process large volumes of data in real time, which is essential for applications requiring continuous and accurate monitoring of road infrastructure. The effectiveness of DLMs has surpassed conventional methods, reducing inspection costs and times while improving coverage and update frequency for signage inventories [25,39,81,82,83,84]. When applied to UAVs, these models facilitate proactive infrastructure management by optimizing maintenance and road safety through timely detection and documentation of changes in signage.
Table 8 summarizes the main algorithms used for detecting and classifying road signage, highlighting models such as YOLO [18,23,24,104,120] and Faster R-CNN [25,26,27,99,121,122], which are fundamental for the automated analysis of aerial images captured by UAVs. These algorithms combine precision and speed in real-time object identification, making them crucial for creating signage inventories in complex and dynamic environments. Table 8 includes metrics such as precision and mAP [8,41], which are critical for evaluating each algorithm’s performance across varied contexts—from densely populated urban areas to rural zones [123]. These metrics provide a quantitative comparison that helps identify algorithms best suited to different operational conditions. In managing signage inventories, high performance in precision and mAP ensures that signs are detected and classified reliably and quickly—an essential feature for systems requiring periodic and accurate real-time updates.

4.4.2. Segmentation and Processing of Aerial Images

The segmentation of aerial images [96,126] is fundamental for the analysis of road infrastructures, as it allows for distinguishing elements of interest, such as signage, from other objects and structures in the environment. By applying segmentation algorithms to images captured by UAVs, the identification and classification of traffic signs are facilitated, especially in urban environments with high visual density [120]. Semantic segmentation, which assigns a label to each pixel in the image, is particularly useful for differentiating signage elements from the rest of the infrastructure, increasing accuracy in automated road monitoring [19,94,95]. Advanced processing of aerial images through enhancement and quality adjustment techniques optimizes segmentation results by adapting models to variable lighting and contrast conditions [115]. This is crucial for achieving reliable identification at different times of the day and under various environmental conditions [29]. Accurate segmentation not only improves the detection accuracy of signage but also enables more detailed and frequent monitoring of road infrastructure, facilitating maintenance planning and ensuring a quick response to potential damages or changes in signage [127].
Table 9 presents advanced segmentation techniques that are essential for automated signage management, especially in complex urban areas with visual elements that could interfere with detection, such as vegetation, buildings, or vehicles [31,100,127]. This segmentation capability is particularly relevant in environments with high visual interference, improving accuracy in signage inventories by reducing classification errors. Additionally, Table 9 details precision and mAP metrics [8,41] for each segmentation technique, providing an effectiveness criterion for isolating key elements in the road environment. These metrics allow researchers to identify the most suitable technique for contexts with high visual variability. By offering a quantitative comparison, Table 9 facilitates the selection of techniques that enhance accuracy in signage inventories, enabling more detailed and reliable updates of detected road elements. This is essential for automated monitoring and maintenance planning of infrastructures [32,73].
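As an illustration of the per-pixel labelling idea discussed above, the sketch below (assuming torchvision >= 0.13 and a hypothetical aerial tile "aerial_tile.jpg") applies a pretrained DeepLabV3 model; its generic classes would need to be replaced by retraining on an aerial or roadside dataset before signage elements could be isolated.

```python
# Minimal semantic segmentation sketch with a pretrained DeepLabV3 backbone.
import torch
import torchvision
from torchvision.io import read_image

weights = torchvision.models.segmentation.DeepLabV3_ResNet50_Weights.DEFAULT
model = torchvision.models.segmentation.deeplabv3_resnet50(weights=weights)
model.eval()

img = read_image("aerial_tile.jpg")              # uint8 [C, H, W]
batch = weights.transforms()(img).unsqueeze(0)   # resize + normalization expected by the weights

with torch.no_grad():
    logits = model(batch)["out"]                 # [1, num_classes, H, W]
mask = logits.argmax(dim=1).squeeze(0)           # per-pixel class index

print("pixels per class:", torch.bincount(mask.flatten()))
```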

4.4.3. Integration of Sensors and Multisensor Systems

The integration of multiple sensors [35,36,37], such as RGB cameras [10], LiDAR [11,12,13], and multispectral cameras [14], provides UAVs with a more comprehensive and precise view of the road environment, significantly improving the detection and classification of road signage [17,65]. Each sensor type contributes unique information: RGB cameras [10] capture details of color and shape, LiDAR sensors [11,12,13] provide depth data, and multispectral cameras [14] facilitate detection under low-visibility conditions. This combination of data results in more robust and accurate detection, particularly where environmental conditions can affect the performance of a single sensor [42,52]. Additionally, multisensor systems [35,36,37,38] enable the creation of high-precision 3D maps and real-time segmentation, essential for maintaining an updated and accurate inventory of road signage [4,17]. The integration of different sensors minimizes detection errors and optimizes UAV capabilities to operate in diverse contexts, adapting to variations in lighting, weather, and topography [129]. This synergy makes UAVs versatile tools for infrastructure inspection, enhancing safety and efficiency in their management.
Table 10 summarizes studies employing multisensor approaches [35,36,37,38] to improve accuracy and reliability in data capture [117], fundamental aspects for ensuring the quality of road signage inventories and minimizing detection errors under various conditions. The metrics presented, such as precision and mAP [8,41], illustrate how sensor integration enhances the performance of UAV systems in signage detection. These data are crucial for understanding the advantages of multisensor integration [35,36,37,38], enabling the overcoming of limitations inherent to individual sensors. The information contained in Table 10 helps researchers and professionals select more robust and precise sensor configurations, thereby optimizing the quality and reliability of road signage inventories in areas with high environmental variability [129].
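A simple form of the sensor fusion described above is late fusion of camera detections with LiDAR depth. The sketch below assumes the point cloud has already been projected into the camera frame as an aligned per-pixel depth map (extrinsic calibration taken as given); each detected sign then receives a robust range estimate, which supports 3D geolocation and the rejection of implausible detections.

```python
# Minimal late-fusion sketch: attach a LiDAR-derived range to RGB bounding boxes.
import numpy as np

def attach_depth(boxes, depth_map, min_valid=0.1):
    """boxes: iterable of (x1, y1, x2, y2) in pixels; depth_map: HxW array in metres."""
    fused = []
    for x1, y1, x2, y2 in boxes:
        patch = depth_map[int(y1):int(y2), int(x1):int(x2)]
        valid = patch[patch > min_valid]          # drop LiDAR no-return pixels
        rng = float(np.median(valid)) if valid.size else np.nan
        fused.append({"box": (x1, y1, x2, y2), "range_m": rng})
    return fused

# Toy example: one detection over a synthetic depth map.
depth = np.full((480, 640), 12.0)
depth[100:200, 300:400] = 8.5                     # the sign sits closer than the background
print(attach_depth([(300, 100, 400, 200)], depth))
```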

4.4.4. SLAM (Simultaneous Localization and Mapping)

The SLAM technique is fundamental for the autonomous navigation of UAVs in environments where the global positioning system (GPS) is unreliable, such as dense urban areas or indoor spaces [133,134,135,136,137]. SLAM enables UAVs to generate detailed maps in real-time while localizing themselves within these maps, which is essential for detecting and monitoring signage in complex infrastructures [134]. This technique ensures that the UAV maintains a precise trajectory and avoids obstacles while collecting data on the condition of road signage [87]. The application of SLAM facilitates the accurate geolocation of infrastructure elements, enabling the creation of detailed and updated inventories [11,138]. Maps generated through SLAM enhance precision in identifying and documenting traffic signs and other critical elements, supporting maintenance planning and the implementation of improvements in road infrastructures [3]. The ability to operate without GPS expands the reach of UAVs, allowing for more comprehensive inspections in hard-to-access areas [61].
Table 11 compiles a selection of studies on the use of SLAM in applications with UAVs and other autonomous systems, utilizing technologies such as LiDAR [11,12,13], 360° cameras [11], and particle maps [138], among others. Each technique is evaluated based on key metrics such as mAP and precision [8,41], highlighting achievements such as 90.8–93.2% accuracy in 3D inspection using LiDAR and deep learning. The main challenges include high computational costs and the need for calibration in complex conditions [134,139]. Table 11 provides a comprehensive overview of the advantages, limitations, and applications of each technique, offering a valuable reference for researchers in the field of mapping and localization in complex environments. The evaluation of these metrics clarifies how SLAM contributes to improving accuracy and efficiency in detecting and geolocating signage, which is essential for real-time monitoring applications requiring precise geospatial data, especially in areas with limited GPS access.
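The geolocation step enabled by SLAM can be illustrated with a deliberately simplified 2D example: given a SLAM-estimated UAV pose and a range–bearing observation of a detected sign (e.g., from LiDAR returns associated with the detection), the sign's map coordinates follow from basic trigonometry. This is a sketch under those assumptions, not a full SLAM implementation.

```python
# Minimal sketch: georeference a detected sign from a SLAM pose without GPS (2D case).
import math

def sign_world_position(pose_xy_yaw, range_m, bearing_rad):
    """pose_xy_yaw: (x, y, yaw) of the UAV in the map frame; bearing is relative to heading."""
    x, y, yaw = pose_xy_yaw
    angle = yaw + bearing_rad
    return (x + range_m * math.cos(angle),
            y + range_m * math.sin(angle))

# UAV at (120.0, 35.0) heading 90 degrees; sign observed 8 m away, 15 degrees to the right.
print(sign_world_position((120.0, 35.0, math.radians(90)), 8.0, math.radians(-15)))
```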

4.4.5. Object Detection Techniques Using UAVs

Object detection techniques using UAVs have significantly advanced thanks to the use of computer vision algorithms [16,17,18,19,20], enabling the rapid and accurate identification and classification of traffic signs and other road infrastructure elements. UAVs capture high-resolution aerial images [17,21], processed in real time by AI models, facilitating object detection in both dense urban areas and rural roads [1,133,134,135,136,137]. These techniques are essential for maintaining detailed and up-to-date road signage inventories, enabling proactive infrastructure management. The ability of UAVs to identify moving objects and operate under diverse environmental conditions makes them versatile and efficient tools for continuous signage monitoring [42,52]. The high precision of these models reduces the risk of errors in sign identification and enables frequent, uninterrupted inspections, representing a significant improvement over manual methods by optimizing resources and enhancing road safety.
Table 12 summarizes key studies on object detection using UAVs, detailing advanced techniques along with their respective advantages and challenges. Methods include algorithms such as Faster R-CNN [18,23,24,104,120] with InceptionResnetV2 [17] and MOD-YOLO [14,147], among others, designed to address detection challenges in various environments. Notable metrics include mAP values [41] and precision rates such as 98.74% for CeDiRNet [93] and 94.3% for a modified version of MaskRCNN [147]. Challenges include computational complexity and sensitivity to small objects, highlighting both the opportunities and limitations of UAV use in this field. Additionally, the table illustrates how each technique aligns with specific needs for detecting objects in aerial images. For instance, MSA-CenterNet [73] with ResNet50 excels in detecting small objects and scale variations, achieving an accuracy of 89.2%, while CSP with YOLOv4+ [18] demonstrates strong results in precision and speed for multi-scale detection [3]. Each technique addresses unique challenges such as occlusion and class imbalance, emphasizing the importance of adapting algorithms to the environment and object type. Table 12 provides metrics like mAP and precision [8,41], enabling researchers to compare the performance of different models in signage detection. These data are crucial for evaluating which algorithms best adapt to environmental variability, ensuring precise and efficient sign identification in urban and rural areas [1,134,135,136,137]. The comparative information facilitates the selection of models that optimize inventory capture and updates, reducing the risk of missing signage and improving road infrastructure management.
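Because traffic signs occupy few pixels in high-resolution aerial frames, many of the approaches above rely on tiling the image before detection. The sketch below illustrates that strategy under stated assumptions: `detect` is a hypothetical callable standing in for any of the detectors discussed (returning (x1, y1, x2, y2, score) tuples per tile), and overlapping duplicates would normally be merged with non-maximum suppression afterwards.

```python
# Minimal sketch of overlapping tiling for small-object detection in large aerial images.
import numpy as np

def tile_and_detect(image, detect, tile=640, overlap=128):
    h, w = image.shape[:2]
    step = tile - overlap
    detections = []
    for y0 in range(0, max(h - overlap, 1), step):
        for x0 in range(0, max(w - overlap, 1), step):
            patch = image[y0:y0 + tile, x0:x0 + tile]
            for x1, y1, x2, y2, score in detect(patch):
                # Shift tile-local boxes back to full-image coordinates.
                detections.append((x1 + x0, y1 + y0, x2 + x0, y2 + y0, score))
    return detections

# Usage with a dummy detector that finds nothing, on a 2000 x 3000 px frame.
img = np.zeros((2000, 3000, 3), dtype=np.uint8)
print(len(tile_and_detect(img, lambda patch: [])))
```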

4.4.6. Performance Evaluation of Detection and Classification Models

Evaluating the performance of detection and classification models for road signage is essential to selecting appropriate technologies for monitoring road infrastructures [150]. Metrics such as precision, mAP [8,41], and F1-score [25] are fundamental for measuring the effectiveness of these models across different environments and conditions. A rigorous analysis helps identify the most reliable models, optimizing available resources and improving the quality of automated monitoring [151]. These indicators guide the selection of efficient and suitable algorithms for each context, ensuring accurate detection and reliable classification of traffic signs, and they contribute to the development of monitoring systems that meet the safety and maintenance demands of road infrastructures [47].
Table 13 synthesizes key studies on defect detection models in infrastructure using vision techniques [16,17,18,19,20] and sensors such as LiDAR [11,12,13]. For instance, the YOLOv5l model [150], with rule-based post-processing, achieved an mAP [41] of 59% and a precision [8] of 61%, though it has limitations in detecting small defects. Conversely, UAV systems with LiDAR [11,12,13] and optimized flight planning achieve accuracies between 88.9% and 92.4%, and when integrated with RGB cameras [10], they reach accuracies of 95–99% in detecting damage to structures such as dikes. Each technique presents specific limitations: The YOLOv5l model [150] is less effective at detecting small defects, while LiDAR-based methods [11,12,13] are sensitive to adverse weather conditions. These studies underscore the importance of adapting models to environmental characteristics and infrastructure analysis needs, demonstrating that precision metrics [8] are essential for evaluating system effectiveness and reliability. Table 13 also provides a better understanding of each model’s limitations and strengths, facilitating decisions for implementing automated monitoring systems that optimize road traffic safety and efficiency [116].
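The precision, recall, and F1 figures quoted throughout these tables rest on IoU-based matching of predicted boxes to ground truth. The sketch below shows that evaluation loop at a fixed IoU threshold; mAP additionally averages precision over recall levels and classes, which is omitted here for brevity.

```python
# Minimal sketch: greedy IoU matching and precision / recall / F1 at a fixed threshold.
def iou(a, b):
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

def precision_recall_f1(predictions, ground_truth, iou_thr=0.5):
    matched = set()
    tp = 0
    for pred in sorted(predictions, key=lambda p: -p[4]):      # (x1, y1, x2, y2, score)
        best, best_iou = None, iou_thr
        for i, gt in enumerate(ground_truth):
            if i not in matched and iou(pred[:4], gt) >= best_iou:
                best, best_iou = i, iou(pred[:4], gt)
        if best is not None:
            matched.add(best)
            tp += 1
    fp = len(predictions) - tp
    fn = len(ground_truth) - tp
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

print(precision_recall_f1([(10, 10, 50, 50, 0.9)], [(12, 12, 48, 52)]))
```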

4.4.7. Challenges and Recent Advances in Photogrammetry and 3D Mapping

Photogrammetry [154] and 3D mapping [146] are techniques that enable the creation of detailed representations of road infrastructures, facilitating the planning and management of signage maintenance. Using UAVs, aerial images are captured and then processed to generate three-dimensional models of roads and traffic signs, providing a detailed view of the infrastructure’s condition and aiding in damage identification and preventive maintenance measures [155]. Recent advancements in these techniques have enhanced the accuracy of 3D models, thereby optimizing road infrastructure monitoring [13,33]. Although technical and data processing challenges persist, the use of photogrammetry and 3D mapping with UAVs has become an essential tool in infrastructure management, enabling continuous and real-time monitoring of road elements [156].
Table 14 presents key scientific studies exploring the use of UAVs to capture three-dimensional images and generate accurate models of roads and signs [13,33]. This 3D reconstruction capability allows for effective planning and management of signage inventories, proving particularly useful for infrastructure maintenance and upgrade projects [62,80]. Table 14 includes precision metrics for 3D reconstruction, providing a foundation for evaluating the effectiveness of these techniques in capturing fine details of road infrastructure [55,75]. Additionally, the studies compiled in Table 14 examine the challenges and recent advancements in using photogrammetry and 3D mapping within the context of road infrastructure [55,80,146,154,155]. The incorporation of UAVs has improved speed and reduced data capture costs compared to traditional methods while also increasing access to hard-to-reach areas [125]. The included articles detail employed techniques and specific challenges, such as the need for advanced processing algorithms and addressing precision issues under variable lighting and weather conditions.
Each article in Table 14 provides performance metrics for photogrammetry and 3D mapping models, such as accuracy, spatial resolution, and processing times, which are crucial for evaluating the feasibility of these techniques in practical applications [55,80,146,154,155]. Some studies also innovate by incorporating multispectral sensors [14] and LiDAR [11,12,13], which enhance the quality of 3D models by capturing depth data and environmental details. Collectively, this information allows researchers and professionals to understand current capabilities and identify necessary advancements in photogrammetry and 3D mapping, contributing to continuous improvements in road infrastructure monitoring and management.
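A quantity that links flight planning to the achievable level of detail in these reconstructions is the ground sampling distance (GSD). The sketch below computes it for a nadir-looking camera from standard parameters; the camera values are illustrative assumptions, not figures taken from the cited studies.

```python
# Minimal sketch of a standard photogrammetric planning calculation (GSD in cm/pixel).
def ground_sampling_distance(altitude_m, focal_mm, sensor_width_mm, image_width_px):
    """Returns the ground sampling distance in centimetres per pixel for a nadir camera."""
    return (sensor_width_mm * altitude_m * 100.0) / (focal_mm * image_width_px)

# Example: a 1-inch sensor (13.2 mm wide), 8.8 mm focal length, 5472 px wide images.
for altitude in (30, 60, 120):
    gsd = ground_sampling_distance(altitude, focal_mm=8.8,
                                   sensor_width_mm=13.2, image_width_px=5472)
    print(f"{altitude:>4} m AGL -> {gsd:.2f} cm/px")
```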
In addition to the detection and monitoring of road signage, comprehensive assessment of pavement deterioration represents a critical component of urban infrastructure management. Pavement degradation includes not only crack formation but also potholes, surface raveling, structural deformations (such as rutting or load-induced depressions), surface wear, and other defects that significantly impact road safety and driving comfort [109]. According to [157], these pavement defects present considerable safety hazards, particularly in urban environments where heavy traffic accelerates pavement deterioration.
Recent investigations have explored advanced technologies for the assessment of pavement deterioration using UAVs equipped with LiDAR sensors and high-resolution optical cameras. Ref. [157] proposed a UAV-based method integrating LiDAR data with photogrammetric techniques to accurately identify and quantify pavement surface defects such as potholes, raveling, and structural deformation. This method efficiently generates detailed three-dimensional models from point clouds, allowing for precise identification of pavement surface defects and providing quantitative metrics for objective evaluation.
In dense urban areas characterized by intense traffic conditions that accelerate pavement degradation, UAV-based technologies offer substantial advantages by enabling frequent and detailed inspections without disrupting traffic flow or endangering inspection personnel [109]. The capability of generating high-resolution 3D models through aerial photogrammetry facilitates the early identification of critical areas that require immediate intervention [109]. This proactive approach contributes to more efficient preventive maintenance planning and reduces operational costs associated with delayed corrective actions.
Furthermore, these advanced techniques facilitate the integration of collected data into GIS, enabling proactive infrastructure management based on continuously updated information [109,110]. This integration not only improves preventive urban maintenance planning but also directly contributes to improved road safety by anticipating structural failures before they pose significant risks to users [110]. Recent studies have highlighted how these technologies can be effectively incorporated into intelligent urban management platforms, supporting informed decision-making processes based on continuously updated data streams [109,110].
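One common way to quantify the pavement deformations discussed above from a UAV or LiDAR point cloud is to fit the nominal road surface as a plane and flag points that fall below it by more than a tolerance. The sketch below is a minimal version of that idea with synthetic data; real pipelines add ground filtering, outlier rejection, and clustering of the flagged points, and the 3 cm tolerance is an illustrative assumption.

```python
# Minimal sketch: least-squares plane fit and deviation thresholding for pavement defects.
import numpy as np

def pavement_deviations(points, tolerance_m=0.03):
    """points: Nx3 array (x, y, z). Returns signed deviations and the flagged points."""
    A = np.c_[points[:, 0], points[:, 1], np.ones(len(points))]
    coeffs, *_ = np.linalg.lstsq(A, points[:, 2], rcond=None)   # z ~ a*x + b*y + c
    deviation = points[:, 2] - A @ coeffs
    defects = points[deviation < -tolerance_m]
    return deviation, defects

# Synthetic example: a flat road at z = 0 with a 5 cm deep depression.
rng = np.random.default_rng(0)
cloud = np.c_[rng.uniform(0, 10, 2000), rng.uniform(0, 3, 2000), np.zeros(2000)]
cloud[:100, 2] -= 0.05
dev, defects = pavement_deviations(cloud)
print(f"{len(defects)} points flagged as potential defects")
```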

4.4.8. Aerial Image Segmentation

The segmentation of aerial images is essential for distinguishing signage elements from other objects in the road environment [21]. Equipped with advanced algorithms [16,17,18,19,20], UAVs can identify and isolate traffic signs in areas with high visual density, such as urban zones, enabling more accurate detection and reducing errors in signage classification [1,51,133,134,135,136,137]. This level of segmentation is fundamental for the automated inventory of infrastructures, as it facilitates a detailed understanding and more effective management of road elements. Moreover, by improving the reliability of monitoring systems, precise segmentation enables proactive maintenance of traffic infrastructures and contributes to road safety.
Table 15 presents an analysis of an advanced aerial image segmentation technique [158], evaluated based on its advantages, challenges, and performance metrics, such as overall accuracy. This hybrid technique combines segmentation based on the canopy height model (CHM) with a point-based clustering approach, complemented by a multiscale adaptive filter and supervoxel-weighted fuzzy clustering. Compared to conventional methods, this technique provides more accurate segmentation of individual elements, such as trees, and enhances performance and efficiency in road infrastructure applications, where precise identification of visual objects is key for reliable monitoring. Table 15 also highlights the challenges associated with each technique, such as the need to adjust parameters depending on environmental characteristics and object density, which can influence accuracy in certain contexts. For instance, the hybrid method demonstrates an accuracy of 88.27% in segmentation, showcasing its effectiveness in specific applications while emphasizing the importance of parameter adjustments in variable environments. This information allows researchers to better understand the capabilities and limitations of each segmentation technique and select the optimal approach for monitoring and managing road infrastructures with UAVs.
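The point-based clustering component of such hybrid pipelines can be illustrated in isolation. The sketch below (a simplification, not the full CHM plus supervoxel-weighted fuzzy method of Table 15) uses DBSCAN to group above-ground points into individual objects, which can then be separated from signage candidates; the height threshold and clustering parameters are illustrative assumptions.

```python
# Minimal sketch of the point-based clustering step: DBSCAN on ground-normalised points.
import numpy as np
from sklearn.cluster import DBSCAN

def cluster_objects(points, height_threshold=0.5, eps=0.8, min_samples=10):
    """points: Nx3 array of ground-normalised (x, y, height) values."""
    above_ground = points[points[:, 2] > height_threshold]
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(above_ground[:, :2])
    return [above_ground[labels == k] for k in set(labels) if k != -1]

# Synthetic example: two compact objects plus scattered low ground points.
rng = np.random.default_rng(1)
obj_a = rng.normal([2, 2, 3], 0.2, (200, 3))
obj_b = rng.normal([8, 5, 2], 0.2, (200, 3))
ground = np.c_[rng.uniform(0, 10, 500), rng.uniform(0, 10, 500), rng.uniform(0, 0.2, 500)]
print(len(cluster_objects(np.vstack([obj_a, obj_b, ground]))))   # expected: 2 clusters
```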

4.4.9. Object Detection with UAVs and LiDAR

The integration of images captured by UAVs with LiDAR data enables robust and accurate detection in road signage monitoring [159,160]. Thanks to the depth information provided by LiDAR, detection is improved in areas with obstacles or low-visibility conditions, facilitating detailed mapping of traffic infrastructures and the creation of signage inventories [79,140]. This combination optimizes the accuracy in the classification of road elements, reduces errors, and provides a more comprehensive view of the environment—essential aspects for the planning and management of road infrastructures [156].
Table 16 presents a compilation of studies on advanced object detection techniques in road infrastructures using UAVs and LiDAR systems. Among the highlighted methods is an approach that combines UAVs–LiDAR with a mobile photogrammetric system (MPS) for the precise extraction of geometric parameters of roads, enabling the generation of longitudinal and cross-sectional profiles [109]. This technique stands out for its efficiency and offers a more cost-effective alternative compared to conventional LiDAR systems, although it faces challenges in filtering non-ground points and achieving temporal synchronization between sensors, which are critical to ensuring data capture accuracy. Another study [47] uses a VLP-32C laser scanner and a road marking degradation model based on 3D point cloud intensity, facilitating the generation of degradation maps and enabling the retroreflectivity estimation of markings without additional equipment. This technique is particularly useful for identifying areas with deteriorated signage, providing a clear visualization of infrastructure conditions. However, it presents challenges in segmenting heavily worn markings and reducing noise in measurements, factors that can affect result accuracy. Collectively, the studies in Table 16 highlight both the advantages and the challenges of combining UAVs and LiDAR for object detection in road infrastructure management [159,160]. The described techniques improve monitoring accuracy and efficiency but also emphasize the importance of addressing issues such as sensor synchronization and handling large data volumes—crucial aspects for effective real-time implementation.
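An intensity-based degradation check in the spirit of the road-marking study cited above [47] can be sketched as follows: marking points whose LiDAR return intensity falls below a fraction of the nominal (well-preserved) intensity are flagged as degraded. The thresholds below are illustrative assumptions, not values from the original work.

```python
# Minimal sketch: flag worn lane-marking points from LiDAR return intensity.
import numpy as np

def marking_degradation(marking_points, nominal_intensity=None, degraded_ratio=0.6):
    """marking_points: Nx4 array (x, y, z, intensity) already segmented as lane marking."""
    intensity = marking_points[:, 3]
    if nominal_intensity is None:
        nominal_intensity = np.percentile(intensity, 90)    # proxy for pristine paint
    degraded = intensity < degraded_ratio * nominal_intensity
    return degraded.mean(), marking_points[degraded]

# Synthetic marking: 70% strong returns, 30% worn paint with low intensity.
rng = np.random.default_rng(2)
pts = np.c_[rng.uniform(0, 50, 1000), rng.uniform(0, 0.15, 1000),
            np.zeros(1000), np.where(rng.random(1000) < 0.3, 40.0, 180.0)]
share, worn = marking_degradation(pts)
print(f"degraded share: {share:.1%}")
```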

5. Discussion

5.1. Comparative Efficiency of AI Algorithms in Road Signage Detection

The results obtained in this study confirm that deep learning algorithms, such as YOLOv4 and Faster R-CNN, are highly effective in the detection and classification of road signage [18,23,24]. These models stand out for their precision and speed, fundamental characteristics for real-time applications where reliability and quick response are critical. Compared to other methods, these algorithms demonstrate superior ability in identifying traffic signs in complex urban environments where visual variability can hinder accuracy. The data presented in Table 8 reinforce this observation, showing that these models maintain high levels of precision and mAP [8,41] even under adverse conditions, a determining factor in achieving reliable and up-to-date road signage inventories [41].
Furthermore, precise segmentation [32,33] of signage using these models significantly reduces the incidence of classification errors, a crucial aspect for monitoring densely urbanized areas [1]. When comparing these results with previous studies, it is evident that the combination of UAVs with advanced AI not only reduces the costs and time required for inspections but also allows for greater coverage and update frequency [4,117]. This represents a considerable advantage over manual inspection methods, which are often limited in terms of accuracy and frequency. The study by Lin and Habib [116] confirms that computer vision algorithms [16,17,18,19] are essential to address the limitations of traditional methods and improve the management of road infrastructures.
While deep learning algorithms such as YOLOv4 and Faster R-CNN have demonstrated high accuracy in detecting and classifying road signage, it is essential to critically assess the factors influencing system performance. Environmental conditions, such as lighting variability, weather, and drone flight autonomy, can significantly affect detection accuracy [161,162]. Moreover, the robustness of AI models when confronted with low-quality or occluded signage images remains a challenge. For example, occlusions caused by vegetation or urban obstacles may reduce the precision of the model [161,162]. Techniques such as data augmentation and attention mechanisms have been proposed to enhance model resilience under these scenarios. Addressing these factors is vital for ensuring reliable performance in real-world applications.
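The augmentation idea mentioned above can be sketched with standard torchvision transforms: random photometric jitter approximates lighting variability, and random erasing simulates partial occlusion of signs during training. The parameters below are illustrative assumptions rather than settings reported in the reviewed studies.

```python
# Minimal sketch of robustness-oriented data augmentation for sign crops.
import torch
from torchvision import transforms

train_augmentation = transforms.Compose([
    transforms.ColorJitter(brightness=0.4, contrast=0.4, saturation=0.3),  # lighting changes
    transforms.RandomErasing(p=0.5, scale=(0.02, 0.15)),                   # simulated occlusion
])

# RandomErasing operates on tensors, so apply the pipeline to a [C, H, W] float image.
dummy_sign_crop = torch.rand(3, 64, 64)
augmented = train_augmentation(dummy_sign_crop)
print(augmented.shape)
```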
The combination of advanced segmentation techniques and sensors such as LiDAR provides an additional advantage by enabling the accurate detection of small or partially hidden objects [30]. This is particularly useful in areas with a high density of visual elements, such as urban areas with multiple signs and obstacles [29,31]. Segmentation is therefore key to differentiating signage from other objects, improving the reliability and accuracy of automated inventories. This study suggests that the use of AI algorithms along with UAVs is an effective solution to optimize the maintenance and monitoring of infrastructures, especially in complex environments where precision is critical for road safety [163].
The evolution of YOLO algorithms has significantly improved object detection efficiency and real-time performance. Although our study primarily focused on YOLOv4, recent advances, particularly with YOLOv8, demonstrate continuous progress in this field. As highlighted in Table 5, YOLOv8 has shown promising results, achieving higher accuracy and speed in processing images for vehicle detection compared to YOLOv5 [115]. Ref. [68] reported that improved YOLOv8 algorithms enhanced precision in detecting small objects in aerial imagery, with notable improvements in mAP@0.5 and mAP@0.5:0.95 metrics. This advancement is particularly relevant for the detection of traffic signs in complex urban environments. Additionally, YOLOv8 has demonstrated superior performance in aerial vehicle detection, achieving an accuracy of 80.3% and mAP of 79.7% [116], surpassing previous versions in both speed and precision.
While these improvements are significant, it is important to note that research in this field is ongoing. The development of versions beyond YOLOv8 [68,115], such as potential YOLOv9–YOLOv11, may further address efficiency and real-time processing challenges [164,165,166]. The continuous evolution of YOLO algorithms underscores the dynamic nature of object detection research, with each version building upon the strengths of its predecessors while addressing their limitations. This progression suggests that future iterations will likely offer even more efficient and accurate solutions for real-time object detection in various applications, including traffic sign recognition and infrastructure monitoring.

5.2. Impact and Practical Applications of UAV and Advanced Sensor Integration

The use of UAVs combined with AI and advanced sensors [35,36,37] has a significant impact on road infrastructure management, enabling more accurate, faster, and safer inspections. Data collection through multispectral cameras [14] and LiDAR [102,103,104] improves the ability of UAVs to capture critical details of road signs, even under low visibility conditions [14,47]. This level of precision is essential for optimizing signage maintenance planning and reducing operational costs. The data presented in Table 10 demonstrate that the integration of multiple sensors [35,36,37] offers a key advantage over traditional approaches by providing comprehensive coverage and more detailed data in each operation.
The adoption of UAVs for inspection tasks also improves the safety of personnel by eliminating the need to work in potentially hazardous conditions [91]. This is particularly relevant in hard-to-reach areas or high-traffic environments where human intervention poses risks. Previous studies, such as those of [49], have emphasized the importance of robust infrastructure to handle large volumes of data generated by multisensor UAVs [96]. Although data processing and associated costs can be challenging, the benefits of increased safety and precision outweigh these limitations, justifying the adoption of this technology for road infrastructure monitoring [62,117]. The economic feasibility of these technologies, including cost structures related to drone deployment, data processing, and AI model training, is addressed here to provide a clearer comparison with traditional manual inventory methods.
In addition to the technical advances in automating road signage inventory using UAVs and AI, it is crucial to highlight the economic and operational benefits these systems can offer. Although the reviewed literature emphasizes significant improvements in efficiency and cost reduction, it would be valuable to quantify these economic and operational benefits in the examined studies. For example, automation can significantly reduce the time and resources required for manual inspections, which, in turn, can translate into financial savings and better resource allocation. Recent studies have estimated that UAV applications in similar tasks can generate substantial economic impacts, reducing annual losses by millions of dollars due to their ability to optimize operational processes [167]. Furthermore, improved accuracy in the detection and classification of road signs can contribute to enhanced road safety by ensuring that signs are up-to-date and visible. However, future research should focus on providing more precise estimates of these economic and operational benefits to strengthen the justification for adopting these technologies.
The capability of UAVs to collect data in inaccessible areas expands the scope and frequency of inspections, facilitating a constant and thorough evaluation of the conditions of road signage [61]. Furthermore, the combination of deep neural networks and multispectral cameras [14] significantly reduces the probability of human error [4], increasing the reliability of the results. This study supports the effectiveness of UAV technology in collecting and analyzing high-quality data, helping inform decision making in infrastructure planning and maintenance while promoting the implementation of automated systems for traffic management and road safety [148].

5.3. Future Directions for the Optimization and Adoption of UAVs in Automated Road Management

Given that this study has identified several challenges in the implementation of UAVs and AI techniques for road monitoring [77], it is recommended that future research focus on cost optimization. The acquisition of advanced sensors such as LiDAR [102,103,104] and multispectral cameras is expensive [14], which limits their large-scale adoption [62]. Exploring more economical sensor alternatives that maintain adequate levels of precision would facilitate the mass adoption of these technologies in countries with limited resources. The development of hybrid sensors that combine multiple functionalities on a single platform could be a promising solution to reduce costs and maintain effectiveness in data capture [125].
The standardization of evaluation metrics is another crucial aspect for future research. The lack of common metrics to evaluate algorithm performance makes it difficult to compare results between studies, which limits the reproducibility and adoption of these technologies in the field of road management [3,168]. Establishing standardized protocols for the accuracy and effectiveness of AI models applied to road signage would help improve replicability and facilitate the adoption of these systems by entities responsible for infrastructure [116]. This would also allow future research to be compared more effectively, moving toward a consensus in the evaluation of these models [8,41].
Finally, future research should address the robustness of algorithms under adverse environmental conditions, since climatic and lighting factors can significantly affect the accuracy of AI systems [64]. Developing methods to improve the resilience of algorithms in adverse conditions would help ensure consistent performance regardless of context. The integration of deep learning techniques [39,40] with image enhancement algorithms could be an effective solution to mitigate these limitations, allowing for reliable and consistent performance in any environment. Furthermore, it is recommended to investigate real-time data processing techniques, which are critical for the implementation of UAVs in continuous and automated road signage monitoring operations [163].
Integrating UAVs and AI in road signage management enhances operational efficiency and has significant implications for road safety. By enabling faster and more accurate detection of damaged or missing signs, these technologies reduce accidents caused by inadequate signage, thereby improving overall traffic safety. From a socioeconomic perspective, the automation of signage inventory reduces the costs associated with manual inspections and minimizes human exposure to hazardous environments [169]. However, adopting these technologies also raises important legal and ethical considerations, such as ensuring data privacy during drone operations and addressing accountability in system failures [169]. Future research should further explore these dimensions to ensure that deploying UAVs and AI in road infrastructure management is aligned with social values and regulatory frameworks.
Although this study focuses on a systematic review following the PRISMA methodology, it is important to emphasize that future research should further prioritize comparative evaluations of algorithms across diverse regional contexts and varying weather conditions. Explicitly incorporating datasets from different geographic regions and diverse meteorological scenarios would significantly enhance model robustness and generalizability validation. This approach would facilitate a deeper understanding of current limitations and support the development of more resilient and adaptable algorithms suitable for real-world operational environments. Moreover, exploring techniques such as visualization methods and simpler models, like decision trees, could enhance algorithm transparency and interpretability. These advancements would not only improve user trust in AI-based systems but also provide valuable insights for refining models to address specific challenges in road signage detection.

6. Conclusions

The integration of UAVs and AI techniques has significantly transformed the management of road signage inventories. The automation of previously manual and error-prone processes has led to a notable improvement in the accuracy and efficiency of data collection, contributing to safer and more efficient road infrastructure. Reviewed studies show that deep learning algorithms, such as YOLOv4 and Faster R-CNN, have achieved high accuracy rates in detecting and classifying traffic signs. This effectiveness underscores the importance of implementing advanced AI techniques in automated signage detection, although variability in environmental conditions must be considered to ensure reliable results in real-world settings.
The combination of data from UAVs, RGB cameras, and LiDAR sensors has proven to be a successful strategy for improving the quality of information capture. The fusion of these data not only increases the accuracy of object detection but also allows for a more holistic understanding of the urban environment, thus optimizing road signage maintenance. Although the application of simultaneous localization and mapping (SLAM) techniques has improved the autonomous navigation of UAVs, limitations still exist in their effectiveness in dense urban environments. A more innovative approach is required to overcome obstacles presented by low visibility and visual interference, ensuring the creation of accurate maps in real-time.
The review has identified a lack of standardization in performance evaluation methods for AI models used in signage detection. Establishing clear protocols and evaluation metrics would be crucial to facilitate comparability between studies and allow for result replication, contributing to the advancement of the field. Despite significant advances, it is crucial to continue research to overcome current challenges in the automation of road signage inventories. Future studies should focus on improving algorithms, optimizing computational processes, and collecting diversified datasets, ensuring that these technologies can be effectively applied in a variety of real-world scenarios and conditions.

Author Contributions

The authors’ contributions were distributed as follows: G.S.-B., methodology, validation, investigation, original draft preparation, and manuscript review and editing; J.R. and D.T.-F.-B., conceptualization and experimental design; O.A.-G., formal analysis and interpretation of results; A.B., critical review and manuscript editing; J.M.L.-G., resources, funding, visualization and overseeing the overall development of the study. All authors have read and agreed to the published version of the manuscript.

Funding

The authors were supported by the Vitoria-Gasteiz Mobility Lab Foundation, a governmental organization of the Provincial Council of Araba and the local council of Vitoria-Gasteiz under the following project grant: “Generación de mapas mediante drones e Inteligencia Computacional” and “Generación de Inventario Automatizado de Señalética mediante Drones e Inteligencia Computacional”.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Cai, H.; Wang, Y.; Lin, Y.; Li, S.; Wang, M.; Teng, F. Systematic Comparison of Objects Classification Methods Based on ALS and Optical Remote Sensing Images in Urban Areas. Electronics 2022, 11, 3041. [Google Scholar] [CrossRef]
  2. Arief, R.W.; Nurtanio, I.; Samman, F.A. Traffic Signs Detection and Recognition System Using the YOLOv4 Algorithm. In Proceedings of the 2021 International Conference on Artificial Intelligence and Mechatronics Systems (AIMS), Bandung, Indonesia, 28–30 April 2021; pp. 1–6. [Google Scholar] [CrossRef]
  3. Jiang, X.; Cui, Q.; Wang, C.; Wang, F.; Zhao, Y.; Hou, Y.; Zhuang, R.; Mei, Y.; Shi, G. A Model for Infrastructure Detection along Highways Based on Remote Sensing Images from UAVs. Sensors 2023, 23, 3847. [Google Scholar] [CrossRef]
  4. Houben, S.; Stallkamp, J.; Salmen, J.; Schlipsing, M.; Igel, C. Detection of traffic signs in real-world images: The German traffic sign detection benchmark. In Proceedings of the The 2013 International Joint Conference on Neural Networks (IJCNN), Dallas, TX, USA, 4–9 August 2013; pp. 1–8. [Google Scholar] [CrossRef]
  5. Li, W.; Li, H.; Wu, Q.; Chen, X.; Ngan, K.N. Simultaneously Detecting and Counting Dense Vehicles From Drone Images. IEEE Trans. Ind. Electron. 2019, 66, 9651–9662. [Google Scholar] [CrossRef]
  6. Outay, F.; Mengash, H.A.; Adnan, M. Applications of unmanned aerial vehicle (UAV) in road safety, traffic and highway infrastructure management: Recent advances and challenges. Transp. Res. Part A Policy Pract. 2020, 141, 116–129. [Google Scholar] [CrossRef] [PubMed]
  7. Fernández-Sanjurjo, M.; Bosquet, B.; Mucientes, M.; Brea, V.M. Real-time visual detection and tracking system for traffic monitoring. Eng. Appl. Artif. Intell. 2019, 85, 410–420. [Google Scholar] [CrossRef]
  8. Padilla, R.; Netto, S.L.; da Silva, E.A.B. A Survey on Performance Metrics for Object-Detection Algorithms. In Proceedings of the 2020 International Conference on Systems, Signals and Image Processing (IWSSIP), Niteroi, Brazil, 1–3 July 2020; pp. 237–242. [Google Scholar] [CrossRef]
  9. Butilă, E.V.; Boboc, R.G. Urban Traffic Monitoring and Analysis Using Unmanned Aerial Vehicles (UAVs): A Systematic Literature Review. Remote Sens. 2022, 14, 620. [Google Scholar] [CrossRef]
  10. Chen, H.; Hou, L.; Wu, S.; Zhang, G.; Zou, Y.; Moon, S.; Bhuiyan, M. Augmented reality, deep learning and vision-language query system for construction worker safety. Autom. Constr. 2024, 157, 105158. [Google Scholar] [CrossRef]
  11. Huang, D.; Qin, R.; Elhashash, M. Bundle adjustment with motion constraints for uncalibrated multi-camera systems at the ground level. ISPRS J. Photogramm. Remote Sens. 2024, 211, 452–464. [Google Scholar] [CrossRef]
  12. Di Benedetto, A.; Fiani, M. Integration of LiDAR Data into a Regional Topographic Database for the Generation of a 3D City Model. In Proceedings of the Geomatics for Green and Digital Transition; Borgogno-Mondino, E., Zamperlin, P., Eds.; Springer: Cham, Switzerland, 2022; pp. 193–208. [Google Scholar] [CrossRef]
  13. Rashdi, R.; Martínez-Sánchez, J.; Arias, P.; Qiu, Z. Scanning Technologies to Building Information Modelling: A Review. Infrastructures 2022, 7, 49. [Google Scholar] [CrossRef]
  14. Shao, Y.; Huang, Q.; Mei, Y.; Chu, H. MOD-YOLO: Multispectral object detection based on transformer dual-stream YOLO. Pattern Recognit. Lett. 2024, 183, 26–34. [Google Scholar] [CrossRef]
  15. Wan, M.; Gu, G.; Qian, W.; Ren, K.; Maldague, X.; Chen, Q. Unmanned Aerial Vehicle Video-Based Target Tracking Algorithm Using Sparse Representation. IEEE Internet Things J. 2019, 6, 9689–9706. [Google Scholar] [CrossRef]
  16. Zhang, X.; Izquierdo, E.; Chandramouli, K. Dense and Small Object Detection in UAV Vision Based on Cascade Network. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), Seoul, Republic of Korea, 27–28 October 2019; pp. 118–126. [Google Scholar] [CrossRef]
  17. Naranjo, M.; Fuentes, D.; Muelas, E.; Díez, E.; Ciruelo, L.; Alonso, C.; Abenza, E.; Gómez-Espinosa, R.; Luengo, I. Object Detection-Based System for Traffic Signs on Drone-Captured Images. Drones 2023, 7, 112. [Google Scholar] [CrossRef]
  18. Wei, J.; Liu, G.; Liu, S.; Xiao, Z. A novel algorithm for small object detection based on YOLOv4. PeerJ Comput. Sci. 2023, 9, e1314. [Google Scholar] [CrossRef]
  19. Garcia-Garcia, A.; Orts-Escolano, S.; Oprea, S.; Villena-Martinez, V.; Martinez-Gonzalez, P.; Garcia-Rodriguez, J. A survey on deep learning techniques for image and video semantic segmentation. Appl. Soft Comput. 2018, 70, 41–65. [Google Scholar] [CrossRef]
  20. Hsieh, M.R.; Lin, Y.L.; Hsu, W.H. Drone-Based Object Counting by Spatially Regularized Regional Proposal Network. In Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; IEEE: Piscataway, NJ, USA, 2017. [Google Scholar] [CrossRef]
  21. Peng, D.; Bruzzone, L.; Zhang, Y.; Guan, H.; He, P. SCDNET: A novel convolutional network for semantic change detection in high resolution optical remote sensing imagery. Int. J. Appl. Earth Obs. Geoinf. 2021, 103, 102465. [Google Scholar] [CrossRef]
  22. Kumar, A.; Kashiyama, T.; Maeda, H.; Omata, H.; Sekimoto, Y. Real-time citywide reconstruction of traffic flow from moving cameras on lightweight edge devices. ISPRS J. Photogramm. Remote Sens. 2022, 192, 115–129. [Google Scholar] [CrossRef]
  23. Li, X.; Li, X.; Pan, H. Multi-Scale Vehicle Detection in High-Resolution Aerial Images with Context Information. IEEE Access 2020, 8, 208643–208657. [Google Scholar] [CrossRef]
  24. Bochkovskiy, A.; Wang, C.Y.; Liao, H.Y.M. YOLOv4: Optimal Speed and Accuracy of Object Detection. arXiv 2020, arXiv:2004.10934. [Google Scholar] [CrossRef]
  25. Maxwell, A.E.; Pourmohammadi, P.; Poyner, J.D. Mapping the Topographic Features of Mining-Related Valley Fills Using Mask R-CNN Deep Learning and Digital Elevation Data. Remote Sens. 2020, 12, 547. [Google Scholar] [CrossRef]
  26. Wang, L.; Liao, J.; Xu, C. Vehicle Detection Based on Drone Images with the Improved Faster R-CNN. In Proceedings of the 2019 11th International Conference on Machine Learning and Computing (ICMLC’19), Zhuhai China, 22–24 February 2019; Association for Computing Machinery: New York, NY, USA, 2019; pp. 466–471. [Google Scholar] [CrossRef]
  27. Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. In Proceedings of the Advances in Neural Information Processing Systems; Cortes, C., Lawrence, N., Lee, D., Sugiyama, M., Garnett, R., Eds.; Curran Associates, Inc.: New York, NY, USA, 2015; Volume 28. [Google Scholar] [CrossRef]
  28. Weinmann, M.; Weinmann, M. Geospatial Computer Vision Based on Multi-Modal Data—How Valuable Is Shape Information for the Extraction of Semantic Information? Remote Sens. 2018, 10, 2. [Google Scholar] [CrossRef]
  29. Dhulipudi, D.P.; KS, R. Multiclass Geospatial Object Detection using Machine Learning-Aviation Case Study. In Proceedings of the 2020 AIAA/IEEE 39th Digital Avionics Systems Conference (DASC), San Antonio, TX, USA, 11–15 October 2020; pp. 1–6. [Google Scholar] [CrossRef]
  30. Balamuralidhar, N.; Tilon, S.; Nex, F. MultEYE: Monitoring System for Real-Time Vehicle Detection, Tracking and Speed Estimation from UAV Imagery on Edge-Computing Platforms. Remote Sens. 2021, 13, 573. [Google Scholar] [CrossRef]
  31. Wang, Y.; Lin, Y.; Huang, H.; Wang, S.; Wen, S.; Cai, H. A Weak Sample Optimisation Method for Building Classification in a Semi-Supervised Deep Learning Framework. Remote Sens. 2023, 15, 4432. [Google Scholar] [CrossRef]
  32. Susetyo, D.B.; Rizaldy, A.; Hariyono, M.I.; Purwono, N.; Hidayat, F.; Windiastuti, R.; Rachma, T.R.N.; Hartanto, P. A Simple But Effective Approach of Building Footprint Extraction in Topographic Mapping Acceleration. Indones. J. Geosci. 2021, 8, 329–343. [Google Scholar] [CrossRef]
  33. Rastiveis, H.; Shams, A.; Sarasua, W.A.; Li, J. Automated extraction of lane markings from mobile LiDAR point clouds based on fuzzy inference. ISPRS J. Photogramm. Remote Sens. 2020, 160, 149–166. [Google Scholar] [CrossRef]
  34. Page, M.J.; McKenzie, J.E.; Bossuyt, P.M.; Boutron, I.; Hoffmann, T.C.; Mulrow, C.D.; Shamseer, L.; Tetzlaff, J.M.; Akl, E.A.; Brennan, S.E.; et al. The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. BMJ 2021, 372, n71. [Google Scholar] [CrossRef]
  35. Specht, O.; Specht, M.; Stateczny, A.; Specht, C. Concept of an Innovative System for Dimensioning and Predicting Changes in the Coastal Zone Topography Using UAVs and USVs (4DBatMap System). Electronics 2023, 12, 4112. [Google Scholar] [CrossRef]
  36. Liu, H.; Ge, J.; Liu, B.; Yu, W. Multi-feature combination method for point cloud intensity feature image and UAV optical image matching. In Proceedings of the Fourth International Conference on Geoscience and Remote Sensing Mapping (GRSM 2022), Changchun, China, 4–6 November 2022; Lohani, T.K., Ed.; International Society for Optics and Photonics, SPIE: Bellingham, WA, USA, 2023; Volume 12551, p. 125511B. [Google Scholar] [CrossRef]
  37. Kim, S.Y.; Yun Kwon, D.; Jang, A.; Ju, Y.K.; Lee, J.S.; Hong, S. A review of UAV integration in forensic civil engineering: From sensor technologies to geotechnical, structural and water infrastructure applications. Measurement 2024, 224, 113886. [Google Scholar] [CrossRef]
  38. Fu, Y.; Li, C.; Yu, F.R.; Luan, T.H.; Zhang, Y. A Survey of Driving Safety With Sensing, Vehicular Communications, and Artificial Intelligence-Based Collision Avoidance. IEEE Trans. Intell. Transp. Syst. 2022, 23, 6142–6163. [Google Scholar] [CrossRef]
  39. Zhang, J.S.; Cao, J.; Mao, B. Application of deep learning and unmanned aerial vehicle technology in traffic flow monitoring. In Proceedings of the 2017 International Conference on Machine Learning and Cybernetics (ICMLC), Ningbo, China, 9–12 July 2017; Volume 1, pp. 189–194. [Google Scholar] [CrossRef]
  40. Bisio, I.; Haleem, H.; Garibotto, C.; Lavagetto, F.; Sciarrone, A. Performance Evaluation and Analysis of Drone-Based Vehicle Detection Techniques From Deep Learning Perspective. IEEE Internet Things J. 2022, 9, 10920–10935. [Google Scholar] [CrossRef]
  41. Bisio, I.; Garibotto, C.; Haleem, H.; Lavagetto, F.; Sciarrone, A. A Systematic Review of Drone Based Road Traffic Monitoring System. IEEE Access 2022, 10, 101537–101555. [Google Scholar] [CrossRef]
  42. Kahraman, S.; Bacher, R. A comprehensive review of hyperspectral data fusion with lidar and sar data. Annu. Rev. Control 2021, 51, 236–253. [Google Scholar] [CrossRef]
  43. Analytics, C. Web of Science Core Collection. 2024. Available online: https://www.webofscience.com (accessed on 24 October 2012).
  44. Elsevier. ScienceDirect Home Page. 2024. Available online: https://www.sciencedirect.com (accessed on 24 October 2012).
  45. Elsevier. Scopus Document Search. 2024. Available online: https://www.scopus.com (accessed on 24 October 2012).
  46. Gajjar, H.; Sanyal, S.; Shah, M. A comprehensive study on lane detecting autonomous car using computer vision. Expert Syst. Appl. 2023, 233, 120929. [Google Scholar] [CrossRef]
  47. Soilán, M.; González-Aguilera, D.; del Campo-Sánchez, A.; Hernández-López, D.; Del Pozo, S. Road marking degradation analysis using 3D point cloud data acquired with a low-cost Mobile Mapping System. Autom. Constr. 2022, 141, 104446. [Google Scholar] [CrossRef]
  48. Wang, C.; Wen, C.; Dai, Y.; Yu, S.; Liu, M. Urban 3D modeling using mobile laser scanning: A review. Virtual Real. Intell. Hardw. 2020, 2, 175–212. [Google Scholar] [CrossRef]
  49. Movia, A.; Beinat, A.; Crosilla, F. Shadow detection and removal in RGB VHR images for land use unsupervised classification. ISPRS J. Photogramm. Remote Sens. 2016, 119, 485–495. [Google Scholar] [CrossRef]
  50. Lin, Y.; Zhang, H.; Li, G.; Wang, T.; Wan, L.; Lin, H. Improving Impervious Surface Extraction With Shadow-Based Sparse Representation From Optical, SAR, and LiDAR Data. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2019, 12, 2417–2428. [Google Scholar] [CrossRef]
  51. Eslamizade, F.; Rastiveis, H.; Zahraee, N.K.; Jouybari, A.; Shams, A. Decision-level fusion of satellite imagery and LiDAR data for post-earthquake damage map generation in Haiti. Arab. J. Geosci. 2021, 14, 1120. [Google Scholar] [CrossRef]
  52. Dalla Mura, M.; Prasad, S.; Pacifici, F.; Gamba, P.; Chanussot, J.; Benediktsson, J.A. Challenges and Opportunities of Multimodality and Data Fusion in Remote Sensing. Proc. IEEE 2015, 103, 1585–1601. [Google Scholar] [CrossRef]
  53. Han, Y.; Qin, R.; Huang, X. Assessment of dense image matchers for digital surface model generation using airborne and spaceborne images—An update. Photogramm. Rec. 2020, 35, 58–80. [Google Scholar] [CrossRef]
  54. Ryu, K.B.; Kang, S.J.; Jeong, S.I.; Jeong, M.S.; Park, K.R. CN4SRSS: Combined network for super-resolution reconstruction and semantic segmentation in frontal-viewing camera images of vehicle. Eng. Appl. Artif. Intell. 2024, 130, 107673. [Google Scholar] [CrossRef]
  55. Hüthwohl, P.; Brilakis, I. Detecting healthy concrete surfaces. Adv. Eng. Inform. 2018, 37, 150–162. [Google Scholar] [CrossRef]
  56. Jian, L.; Li, Z.; Yang, X.; Wu, W.; Ahmad, A.; Jeon, G. Combining Unmanned Aerial Vehicles With Artificial-Intelligence Technology for Traffic-Congestion Recognition: Electronic Eyes in the Skies to Spot Clogged Roads. IEEE Consum. Electron. Mag. 2019, 8, 81–86. [Google Scholar] [CrossRef]
  57. Shawky, M.; Alsobky, A.; Al Sobky, A.; Hassan, A. Traffic safety assessment for roundabout intersections using drone photography and conflict technique. Ain Shams Eng. J. 2023, 14, 102115. [Google Scholar] [CrossRef]
  58. Gupta, A.; Mhala, P.; Mangal, M.; Yadav, K.; Sharma, S. Traffic Sign Sensing: A Deep Learning approach for enhanced Road Safety. Preprint (Version 1). 2024. Available online: https://www.researchsquare.com/article/rs-3889986/v1 (accessed on 12 March 2025). [CrossRef]
  59. Tian, B.; Yao, Q.; Gu, Y.; Wang, K.; Li, Y. Video processing techniques for traffic flow monitoring: A survey. In Proceedings of the 2011 14th International IEEE Conference on Intelligent Transportation Systems (ITSC), Washington, DC, USA, 5–7 October 2011; pp. 1103–1108. [Google Scholar] [CrossRef]
  60. Ke, R.; Li, Z.; Tang, J.; Pan, Z.; Wang, Y. Real-Time Traffic Flow Parameter Estimation From UAV Video Based on Ensemble Classifier and Optical Flow. IEEE Trans. Intell. Transp. Syst. 2019, 20, 54–64. [Google Scholar] [CrossRef]
  61. Eskandari Torbaghan, M.; Sasidharan, M.; Reardon, L.; Muchanga-Hvelplund, L.C. Understanding the potential of emerging digital technologies for improving road safety. Accid. Anal. Prev. 2022, 166, 106543. [Google Scholar] [CrossRef]
  62. Hou, Y.; Biljecki, F. A comprehensive framework for evaluating the quality of street view imagery. Int. J. Appl. Earth Obs. Geoinf. 2022, 115, 103094. [Google Scholar] [CrossRef]
  63. Yun, T.; Li, J.; Ma, L.; Zhou, J.; Wang, R.; Eichhorn, M.P.; Zhang, H. Status, advancements and prospects of deep learning methods applied in forest studies. Int. J. Appl. Earth Obs. Geoinf. 2024, 131, 103938. [Google Scholar] [CrossRef]
  64. Bayomi, N.; Fernandez, J.E. Eyes in the Sky: Drones Applications in the Built Environment under Climate Change Challenges. Drones 2023, 7, 637. [Google Scholar] [CrossRef]
  65. Tran, T.H.P.; Jeon, J.W. Accurate Real-Time Traffic Light Detection Using YOLOv4. In Proceedings of the 2020 IEEE International Conference on Consumer Electronics—Asia (ICCE-Asia), Seoul, Republic of Korea, 1–3 November 2020; pp. 1–4. [Google Scholar] [CrossRef]
  66. Singh, R.; Danish, M.; Purohit, V.; Siddiqui, A. Traffic Sign Detection using YOLOv4. Int. J. Creat. Res. Thoughts 2021, 9, i891–i897. [Google Scholar]
  67. Wang, Z.; Men, S.; Bai, Y.; Yuan, Y.; Wang, J.; Wang, K.; Zhang, L. Improved Small Object Detection Algorithm CRL-YOLOv5. Sensors 2024, 24, 6437. [Google Scholar] [CrossRef]
  68. Feng, F.; Hu, Y.; Li, W.; Yang, F. Improved YOLOv8 algorithms for small object detection in aerial imagery. J. King Saud Univ.-Comput. Inf. Sci. 2024, 36, 102113. [Google Scholar] [CrossRef]
  69. Flores-Calero, M.; Astudillo, C.A.; Guevara, D.; Maza, J.; Lita, B.S.; Defaz, B.; Ante, J.S.; Zabala-Blanco, D.; Armingol Moreno, J.M. Traffic Sign Detection and Recognition Using YOLO Object Detection Algorithm: A Systematic Review. Mathematics 2024, 12, 297. [Google Scholar] [CrossRef]
  70. Cao, J.; Li, P.; Zhang, H.; Su, G. An Improved YOLOv4 Lightweight Traffic Sign Detection Algorithm. IAENG Int. J. Comput. Sci. 2023, 50, 825–831. [Google Scholar]
  71. Nepal, U.; Eslamiat, H. Comparing YOLOv3, YOLOv4 and YOLOv5 for Autonomous Landing Spot Detection in Faulty UAVs. Sensors 2022, 22, 464. [Google Scholar] [CrossRef]
  72. Elamin, A.; El-Rabbany, A. UAV-Based Multi-Sensor Data Fusion for Urban Land Cover Mapping Using a Deep Convolutional Neural Network. Remote Sens. 2022, 14, 4298. [Google Scholar] [CrossRef]
  73. Jiang, F.; Ma, L.; Broyd, T.; Chen, W.; Luo, H. Building digital twins of existing highways using map data based on engineering expertise. Autom. Constr. 2022, 134, 104081. [Google Scholar] [CrossRef]
  74. Truong-Hong, L.; Lindenbergh, R. Automatically extracting surfaces of reinforced concrete bridges from terrestrial laser scanning point clouds. Autom. Constr. 2022, 135, 104127. [Google Scholar] [CrossRef]
  75. Gao, Q.; Shen, X. ThickSeg: Efficient semantic segmentation of large-scale 3D point clouds using multi-layer projection. Image Vis. Comput. 2021, 108, 104161. [Google Scholar] [CrossRef]
  76. Kaijaluoto, R.; Kukko, A.; El Issaoui, A.; Hyyppä, J.; Kaartinen, H. Semantic segmentation of point cloud data using raw laser scanner measurements and deep neural networks. ISPRS Open J. Photogramm. Remote Sens. 2022, 3, 100011. [Google Scholar] [CrossRef]
  77. Li, W.; Zheng, T.; Yang, Z.; Li, M.; Sun, C.; Yang, X. Classification and detection of insects from field images using deep learning for smart pest management: A systematic review. Ecol. Inform. 2021, 66, 101460. [Google Scholar] [CrossRef]
  78. Gao, X.; Bai, X.; Zhou, K. Research on the Application of Drone Remote Sensing and Target Detection in Civilian Fields. In Proceedings of the 2023 International Conference on High Performance Big Data and Intelligent Systems (HDIS), Macau, China, 6–8 December 2023; pp. 108–112. [Google Scholar] [CrossRef]
  79. Boudiaf, A.; Sumaiti, A.A.; Dias, J. Image-based Obstacle Avoidance using 3DConv Network for Rocky Environment. In Proceedings of the 2022 IEEE International Symposium on Robotic and Sensors Environments (ROSE), Abu Dhabi, United Arab Emirates, 14–15 November 2022; pp. 01–07. [Google Scholar] [CrossRef]
  80. Wang, Y.; Chen, Q.; Zhu, Q.; Liu, L.; Li, C.; Zheng, D. A Survey of Mobile Laser Scanning Applications and Key Techniques over Urban Areas. Remote Sens. 2019, 11, 1540. [Google Scholar] [CrossRef]
  81. Cordts, M.; Omran, M.; Ramos, S.; Rehfeld, T.; Enzweiler, M.; Benenson, R.; Franke, U.; Roth, S.; Schiele, B. The Cityscapes Dataset for Semantic Urban Scene Understanding. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Los Alamitos, CA, USA, 27–30 June 2016; pp. 3213–3223. [Google Scholar] [CrossRef]
  82. Le, X.; Wang, Y.; Jo, J. Combining Deep and Handcrafted Image Features for Vehicle Classification in Drone Imagery. In Proceedings of the 2018 Digital Image Computing: Techniques and Applications (DICTA), Canberra, ACT, Australia, 10–13 December 2018; pp. 1–6. [Google Scholar] [CrossRef]
  83. Acatay, O.; Sommer, L.; Schumann, A.; Beyerer, J. Comprehensive Evaluation of Deep Learning based Detection Methods for Vehicle Detection in Aerial Imagery. In Proceedings of the 2018 15th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), Auckland, New Zealand, 27–30 November 2018; pp. 1–6. [Google Scholar] [CrossRef]
  84. Sommer, L.W.; Schuchert, T.; Beyerer, J. Fast Deep Vehicle Detection in Aerial Images. In Proceedings of the 2017 IEEE Winter Conference on Applications of Computer Vision (WACV), Santa Rosa, CA, USA, 24–31 March 2017; pp. 311–319. [Google Scholar] [CrossRef]
  85. Seidaliyeva, U.; Ilipbayeva, L.; Taissariyeva, K.; Smailov, N.; Matson, E.T. Advances and Challenges in Drone Detection and Classification Techniques: A State-of-the-Art Review. Sensors 2024, 24, 125. [Google Scholar] [CrossRef]
  86. Xiuling, Z.; Huijuan, W.; Yu, S.; Gang, C.; Suhua, Z.; Quanbo, Y. Starting from the structure: A review of small object detection based on deep learning. Image Vis. Comput. 2024, 146, 105054. [Google Scholar] [CrossRef]
  87. Ruan, J.; Cui, H.; Huang, Y.; Li, T.; Wu, C.; Zhang, K. A review of occluded objects detection in real complex scenarios for autonomous driving. Green Energy Intell. Transp. 2023, 2, 100092. [Google Scholar] [CrossRef]
  88. Akshatha, K.R.; Karunakar, A.K.; Satish Shenoy, B.; Phani Pavan, K.; Chinmay, V.D. Manipal-UAV person detection dataset: A step towards benchmarking dataset and algorithms for small object detection. ISPRS J. Photogramm. Remote Sens. 2023, 195, 77–89. [Google Scholar] [CrossRef]
  89. Yasmine, G.; Maha, G.; Hicham, M. Anti-drone systems: An attention based improved YOLOv7 model for a real-time detection and identification of multi-airborne target. Intell. Syst. Appl. 2023, 20, 200296. [Google Scholar] [CrossRef]
  90. Souza, B.J.; Stefenon, S.F.; Singh, G.; Freire, R.Z. Hybrid-YOLO for classification of insulators defects in transmission lines based on UAV. Int. J. Electr. Power Energy Syst. 2023, 148, 108982. [Google Scholar] [CrossRef]
  91. Bhavsar, Y.M.; Zaveri, M.S.; Raval, M.S.; Zaveri, S.B. Vision-based investigation of road traffic and violations at urban roundabout in India using UAV video: A case study. Transp. Eng. 2023, 14, 100207. [Google Scholar] [CrossRef]
  92. Wang, C.Y.; Bochkovskiy, A.; Liao, H.Y.M. YOLOv7: Trainable Bag-of-Freebies Sets New State-of-the-Art for Real-Time Object Detectors. In Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 17–24 June 2023; pp. 7464–7475. [Google Scholar] [CrossRef]
  93. Tabernik, D.; Muhovič, J.; Skočaj, D. Dense center-direction regression for object counting and localization with point supervision. Pattern Recognit. 2024, 153, 110540. [Google Scholar] [CrossRef]
  94. Koelle, M.; Laupheimer, D.; Walter, V.; Haala, N.; Soergel, U. Which 3D data representation does the crowd like best? Crowd-based active learning for coupled semantic segmentation of point clouds and textured meshes. ISPRS Ann. Photogramm. Remote Sens. Spat. Inf. Sci. 2021, 2, 93–100. [Google Scholar] [CrossRef]
  95. Lo Bianco, L.C.; Beltrán, J.; López, G.F.; García, F.; Al-Kaff, A. Joint semantic segmentation of road objects and lanes using Convolutional Neural Networks. Robot. Auton. Syst. 2020, 133, 103623. [Google Scholar] [CrossRef]
  96. Villareal, M.K.; Tongco, A.F. Multi-sensor Fusion Workflow for Accurate Classification and Mapping of Sugarcane Crops. Eng. Technol. Appl. Sci. Res. 2019, 9, 4085–4091. [Google Scholar] [CrossRef]
  97. Quan, Y.; Tong, Y.; Feng, W.; Dauphin, G.; Huang, W.; Zhu, W.; Xing, M. Relative Total Variation Structure Analysis-Based Fusion Method for Hyperspectral and LiDAR Data Classification. Remote Sens. 2021, 13, 1143. [Google Scholar] [CrossRef]
  98. Ahmed, N.; Islam, M.N.; Tuba, A.S.; Mahdy, M.; Sujauddin, M. Solving visual pollution with deep learning: A new nexus in environmental management. J. Environ. Manag. 2019, 248, 109253. [Google Scholar] [CrossRef]
  99. Alizadeh Kharazi, B.; Behzadan, A.H. Flood depth mapping in street photos with image processing and deep neural networks. Comput. Environ. Urban Syst. 2021, 88, 101628. [Google Scholar] [CrossRef]
  100. Kwan, C.; Gribben, D.; Ayhan, B.; Li, J.; Bernabe, S.; Plaza, A. An Accurate Vegetation and Non-Vegetation Differentiation Approach Based on Land Cover Classification. Remote Sens. 2020, 12, 3880. [Google Scholar] [CrossRef]
  101. Fernández-Alvarado, J.; Fernández-Rodríguez, S. 3D environmental urban BIM using LiDAR data for visualisation on Google Earth. Autom. Constr. 2022, 138, 104251. [Google Scholar] [CrossRef]
  102. Xiao, W.; Cao, H.; Tang, M.; Zhang, Z.; Chen, N. 3D urban object change detection from aerial and terrestrial point clouds: A review. Int. J. Appl. Earth Obs. Geoinf. 2023, 118, 103258. [Google Scholar] [CrossRef]
  103. Lugo, G.; Li, R.; Chauhan, R.; Wang, Z.; Tiwary, P.; Pandey, U.; Patel, A.; Rombough, S.; Schatz, R.; Cheng, I. LiSurveying: A high-resolution TLS-LiDAR benchmark. Comput. Graph. 2022, 107, 116–130. [Google Scholar] [CrossRef]
  104. Balali, V.; Jahangiri, A.; Machiani, S.G. Multi-class US traffic signs 3D recognition and localization via image-based point cloud model using color candidate extraction and texture-based recognition. Adv. Eng. Inform. 2017, 32, 263–274. [Google Scholar] [CrossRef]
  105. Gao, C.; Guo, M.; Zhao, J.; Cheng, P.; Zhou, Y.; Zhou, T.; Guo, K. An automated multi-constraint joint registration method for mobile LiDAR point cloud in repeated areas. Measurement 2023, 222, 113620. [Google Scholar] [CrossRef]
  106. Bolkas, D.; Guthrie, K.; Durrutya, L. sUAS LiDAR and Photogrammetry Evaluation in Various Surfaces for Surveying and Mapping. J. Surv. Eng. 2024, 150, 04023021. [Google Scholar] [CrossRef]
  107. Hasanpour Zaryabi, E.; Saadatseresht, M.; Ghanbari Parmehr, E. An object-based classification framework for als point cloud in urban areas. ISPRS Ann. Photogramm. Remote Sens. Spat. Inf. Sci. 2023, 10, 279–286. [Google Scholar] [CrossRef]
  108. Fang, L.; You, Z.; Shen, G.; Chen, Y.; Li, J. A joint deep learning network of point clouds and multiple views for roadside object classification from lidar point clouds. ISPRS J. Photogramm. Remote Sens. 2022, 193, 115–136. [Google Scholar] [CrossRef]
  109. Suleymanoglu, B.; Gurturk, M.; Yilmaz, Y.; Soycan, A.; Soycan, M. Comparison of Unmanned Aerial Vehicle-LiDAR and Image-Based Mobile Mapping System for Assessing Road Geometry Parameters via Digital Terrain Models. Transp. Res. Rec. 2023, 2677, 617–632. [Google Scholar] [CrossRef]
  110. Naimaee, R.; Saadatseresht, M.; Omidalizarandi, M. Automatic extraction of control points from 3D Lidar mobile mapping and UAV imagery for aerial triangulation. In ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Proceedings of the GeoSpatial Conference 2022—Joint 6th SMPR and 4th GIResearch Conferences, Tehran, Iran, 19–22 February 2023; Curran: Red Hook, NY, USA, 2023; Volume X-4/W1-2022, pp. 581–588. [Google Scholar] [CrossRef]
  111. Ma, H.; Liu, Y.; Ren, Y.; Wang, D.; Yu, L.; Yu, J. Improved CNN Classification Method for Groups of Buildings Damaged by Earthquake, Based on High Resolution Remote Sensing Images. Remote Sens. 2020, 12, 260. [Google Scholar] [CrossRef]
  112. Zhu, P.; Wen, L.; Du, D.; Bian, X.; Fan, H.; Hu, Q.; Ling, H. Detection and Tracking Meet Drones Challenge. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 44, 7380–7399. [Google Scholar] [CrossRef]
  113. Pi, Y.; Nath, N.D.; Behzadan, A.H. Convolutional neural networks for object detection in aerial imagery for disaster response and recovery. Adv. Eng. Inform. 2020, 43, 101009. [Google Scholar] [CrossRef]
  114. Zhou, T.; Hasheminasab, S.M.; Ravi, R.; Habib, A. LiDAR-Aided Interior Orientation Parameters Refinement Strategy for Consumer-Grade Cameras Onboard UAV Remote Sensing Systems. Remote Sens. 2020, 12, 2268. [Google Scholar] [CrossRef]
  115. Bakirci, M. Utilizing YOLOv8 for enhanced traffic monitoring in intelligent transportation systems (ITS) applications. Digit. Signal Process. 2024, 152, 104594. [Google Scholar] [CrossRef]
  116. Lin, Y.C.; Habib, A. Semantic segmentation of bridge components and road infrastructure from mobile LiDAR data. ISPRS Open J. Photogramm. Remote Sens. 2022, 6, 100023. [Google Scholar] [CrossRef]
  117. Vassilev, H.; Laska, M.; Blankenbach, J. Uncertainty-aware point cloud segmentation for infrastructure projects using Bayesian deep learning. Autom. Constr. 2024, 164, 105419. [Google Scholar] [CrossRef]
  118. Guan, H.; Lei, X.; Yu, Y.; Zhao, H.; Peng, D.; Marcato Junior, J.; Li, J. Road marking extraction in UAV imagery using attentive capsule feature pyramid network. Int. J. Appl. Earth Obs. Geoinf. 2022, 107, 102677. [Google Scholar] [CrossRef]
  119. Balado, J.; González, E.; Arias, P.; Castro, D. Novel Approach to Automatic Traffic Sign Inventory Based on Mobile Mapping System Data and Deep Learning. Remote Sens. 2020, 12, 442. [Google Scholar] [CrossRef]
  120. Sun, Z.L.; Wang, H.; Lau, W.S.; Seet, G.; Wang, D. Application of BW-ELM model on traffic sign recognition. Neurocomputing 2014, 128, 153–159. [Google Scholar] [CrossRef]
  121. Cheng, J.C.; Wang, M. Automated detection of sewer pipe defects in closed-circuit television images using deep learning techniques. Autom. Constr. 2018, 95, 155–171. [Google Scholar] [CrossRef]
  122. Firmansyah, H.R.; Sarli, P.W.; Twinanda, A.P.; Santoso, D.; Imran, I. Building typology classification using convolutional neural networks utilizing multiple ground-level image process for city-scale rapid seismic vulnerability assessment. Eng. Appl. Artif. Intell. 2024, 131, 107824. [Google Scholar] [CrossRef]
  123. Buján, S.; Guerra-Hernández, J.; González-Ferreiro, E.; Miranda, D. Forest Road Detection Using LiDAR Data and Hybrid Classification. Remote Sens. 2021, 13, 393. [Google Scholar] [CrossRef]
  124. Ma, X.; Li, X.; Tang, X.; Zhang, B.; Yao, R.; Lu, J. Deconvolution Feature Fusion for traffic signs detection in 5G driven unmanned vehicle. Phys. Commun. 2021, 47, 101375. [Google Scholar] [CrossRef]
  125. Qiu, Z.; Martínez-Sánchez, J.; Brea, V.M.; López, P.; Arias, P. Low-cost mobile mapping system solution for traffic sign segmentation using Azure Kinect. Int. J. Appl. Earth Obs. Geoinf. 2022, 112, 102895. [Google Scholar] [CrossRef]
  126. Saovana, N.; Yabuki, N.; Fukuda, T. Automated point cloud classification using an image-based instance segmentation for structure from motion. Autom. Constr. 2021, 129, 103804. [Google Scholar] [CrossRef]
  127. Zheng, J.; Chen, L.; Wang, J.; Chen, Q.; Huang, X.; Jiang, L. Knowledge distillation with T-Seg guiding for lightweight automated crack segmentation. Autom. Constr. 2024, 166, 105585. [Google Scholar] [CrossRef]
  128. Wei, W.; Shu, Y.; Liu, J.; Dong, L.; Jia, L.; Wang, J.; Guo, Y. Research on a hierarchical feature-based contour extraction method for spatial complex truss-like structures in aerial images. Eng. Appl. Artif. Intell. 2024, 127, 107313. [Google Scholar] [CrossRef]
  129. Zhang, Y.; Carballo, A.; Yang, H.; Takeda, K. Perception and sensing for autonomous vehicles under adverse weather conditions: A survey. ISPRS J. Photogramm. Remote Sens. 2023, 196, 146–177. [Google Scholar] [CrossRef]
  130. Aslan, M.F.; Durdu, A.; Yusefi, A.; Yilmaz, A. HVIOnet: A deep learning based hybrid visual–inertial odometry approach for unmanned aerial system position estimation. Neural Netw. 2022, 155, 461–474. [Google Scholar] [CrossRef] [PubMed]
  131. White, C.T.; Reckling, W.; Petrasova, A.; Meentemeyer, R.K.; Mitasova, H. Rapid-DEM: Rapid Topographic Updates through Satellite Change Detection and UAS Data Fusion. Remote Sens. 2022, 14, 1718. [Google Scholar] [CrossRef]
  132. Chen, S.; Shi, W.; Zhou, M.; Zhang, M.; Chen, P. Automatic Building Extraction via Adaptive Iterative Segmentation With LiDAR Data and High Spatial Resolution Imagery Fusion. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2020, 13, 2081–2095. [Google Scholar] [CrossRef]
  133. Chow, J.K.; Liu, K.F.; Tan, P.S.; Su, Z.; Wu, J.; Li, Z.; Wang, Y.H. Automated defect inspection of concrete structures. Autom. Constr. 2021, 132, 103959. [Google Scholar] [CrossRef]
  134. Dong, H.; Chen, X.; Särkkä, S.; Stachniss, C. Online pole segmentation on range images for long-term LiDAR localization in urban environments. Robot. Auton. Syst. 2023, 159, 104283. [Google Scholar] [CrossRef]
  135. Gupta, A.; Anpalagan, A.; Guan, L.; Khwaja, A.S. Deep learning for object detection and scene perception in self-driving cars: Survey, challenges, and open issues. Array 2021, 10, 100057. [Google Scholar] [CrossRef]
  136. Xie, C.; Liu, Q.; Chen, B.; Hao, Z. Evaluation and analysis of feature point detection methods based on vSLAM systems. Image Vis. Comput. 2024, 146, 105015. [Google Scholar] [CrossRef]
  137. Wen, L.H.; Jo, K.H. Deep learning-based perception systems for autonomous driving: A comprehensive survey. Neurocomputing 2022, 489, 255–270. [Google Scholar] [CrossRef]
  138. Balaska, V.; Bampis, L.; Gasteratos, A. Self-localization based on terrestrial and satellite semantics. Eng. Appl. Artif. Intell. 2022, 111, 104824. [Google Scholar] [CrossRef]
  139. Zhang, P.; Du, P.; Lin, C.; Wang, X.; Li, E.; Xue, Z.; Bai, X. A Hybrid Attention-Aware Fusion Network (HAFNet) for Building Extraction from High-Resolution Imagery and LiDAR Data. Remote Sens. 2020, 12, 3764. [Google Scholar] [CrossRef]
  140. Benedek, C.; Majdik, A.; Nagy, B.; Rozsa, Z.; Sziranyi, T. Positioning and perception in LIDAR point clouds. Digit. Signal Process. 2021, 119, 103193. [Google Scholar] [CrossRef]
  141. Luo, S.; Wen, S.; Zhang, L.; Lan, Y.; Chen, X. Extraction of crop canopy features and decision-making for variable spraying based on unmanned aerial vehicle LiDAR data. Comput. Electron. Agric. 2024, 224, 109197. [Google Scholar] [CrossRef]
  142. Behley, J.; Stachniss, C. Efficient Surfel-Based SLAM using 3D Laser Range Data in Urban Environments. In Robotics: Science and Systems XIV; 2018; Available online: https://www.roboticsproceedings.org/rss14/p16.pdf (accessed on 12 March 2025). [CrossRef]
  143. Ouattara, I.; Korhonen, V.; Visala, A. LiDAR-odometry based UAV pose estimation in young forest environment. IFAC-PapersOnLine 2022, 55, 95–100. [Google Scholar] [CrossRef]
  144. Häne, C.; Heng, L.; Lee, G.H.; Fraundorfer, F.; Furgale, P.; Sattler, T.; Pollefeys, M. 3D visual perception for self-driving cars using a multi-camera system: Calibration, mapping, localization, and obstacle detection. Image Vis. Comput. 2017, 68, 14–27. [Google Scholar] [CrossRef]
  145. Narazaki, Y.; Hoskere, V.; Chowdhary, G.; Spencer, B.F. Vision-based navigation planning for autonomous post-earthquake inspection of reinforced concrete railway viaducts using unmanned aerial vehicles. Autom. Constr. 2022, 137, 104214. [Google Scholar] [CrossRef]
  146. Cong, Y.; Chen, C.; Yang, B.; Liang, F.; Ma, R.; Zhang, F. CAOM: Change-aware online 3D mapping with heterogeneous multi-beam and push-broom LiDAR point clouds. ISPRS J. Photogramm. Remote Sens. 2023, 195, 204–219. [Google Scholar] [CrossRef]
  147. Qian, B.; Al Said, N.; Dong, B. New technologies for UAV navigation with real-time pattern recognition. Ain Shams Eng. J. 2024, 15, 102480. [Google Scholar] [CrossRef]
  148. Chouhan, R.; Dhamaniya, A.; Antoniou, C. Analysis of driving behavior in weak lane disciplined traffic at the merging and diverging sections using unmanned aerial vehicle data. Phys. A Stat. Mech. Its Appl. 2024, 646, 129865. [Google Scholar] [CrossRef]
  149. Fu, R.; Chen, C.; Yan, S.; Heidari, A.A.; Wang, X.; Escorcia-Gutierrez, J.; Mansour, R.F.; Chen, H. Gaussian similarity-based adaptive dynamic label assignment for tiny object detection. Neurocomputing 2023, 543, 126285. [Google Scholar] [CrossRef]
  150. Cano-Ortiz, S.; Lloret Iglesias, L.; Martinez Ruiz del Árbol, P.; Lastra-González, P.; Castro-Fresno, D. An end-to-end computer vision system based on deep learning for pavement distress detection and quantification. Constr. Build. Mater. 2024, 416, 135036. [Google Scholar] [CrossRef]
  151. Buters, T.M.; Bateman, P.W.; Robinson, T.; Belton, D.; Dixon, K.W.; Cross, A.T. Methodological Ambiguity and Inconsistency Constrain Unmanned Aerial Vehicles as A Silver Bullet for Monitoring Ecological Restoration. Remote Sens. 2019, 11, 1180. [Google Scholar] [CrossRef]
  152. Alsadik, B.; Remondino, F. Flight Planning for LiDAR-Based UAS Mapping Applications. ISPRS Int. J. Geo-Inf. 2020, 9, 378. [Google Scholar] [CrossRef]
  153. Bakuła, K.; Pilarska, M.; Salach, A.; Kurczyński, Z. Detection of Levee Damage Based on UAS Data—Optical Imagery and LiDAR Point Clouds. ISPRS Int. J. Geo-Inf. 2020, 9, 248. [Google Scholar] [CrossRef]
  154. Yasin Yiğit, A.; Uysal, M. Virtual reality visualisation of automatic crack detection for bridge inspection from 3D digital twin generated by UAV photogrammetry. Measurement 2025, 242, 115931. [Google Scholar] [CrossRef]
  155. Nath, N.D.; Cheng, C.S.; Behzadan, A.H. Drone mapping of damage information in GPS-Denied disaster sites. Adv. Eng. Inform. 2022, 51, 101450. [Google Scholar] [CrossRef]
  156. Hyyppä, E.; Muhojoki, J.; Yu, X.; Kukko, A.; Kaartinen, H.; Hyyppä, J. Efficient coarse registration method using translation- and rotation-invariant local descriptors towards fully automated forest inventory. ISPRS Open J. Photogramm. Remote Sens. 2021, 2, 100007. [Google Scholar] [CrossRef]
  157. Soilán, M.; Truong-Hong, L.; Riveiro, B.; Laefer, D. Automatic extraction of road features in urban environments using dense ALS data. Int. J. Appl. Earth Obs. Geoinf. 2018, 64, 226–236. [Google Scholar] [CrossRef]
  158. Fu, Y.; Niu, Y.; Wang, L.; Li, W. Individual-Tree Segmentation from UAV–LiDAR Data Using a Region-Growing Segmentation and Supervoxel-Weighted Fuzzy Clustering Approach. Remote Sens. 2024, 16, 608. [Google Scholar] [CrossRef]
  159. Guan, H.; Sun, X.; Su, Y.; Hu, T.; Wang, H.; Wang, H.; Peng, C.; Guo, Q. UAV-lidar aids automatic intelligent powerline inspection. Int. J. Electr. Power Energy Syst. 2021, 130, 106987. [Google Scholar] [CrossRef]
  160. Li, X.; Liu, C.; Wang, Z.; Xie, X.; Li, D.; Xu, L. Airborne LiDAR: State-of-the-art of system design, technology and application. Meas. Sci. Technol. 2020, 32, 032002. [Google Scholar] [CrossRef]
  161. Tang, H.; Kamei, S.; Morimoto, Y. Data Augmentation Methods for Enhancing Robustness in Text Classification Tasks. Algorithms 2023, 16, 59. [Google Scholar] [CrossRef]
  162. AbuKhait, J. US Road Sign Detection and Visibility Estimation using Artificial Intelligence Techniques. Int. J. Adv. Comput. Sci. Appl. 2024, 15. [Google Scholar] [CrossRef]
  163. Wang, W.; He, H.; Ma, C. An improved Deeplabv3+ model for semantic segmentation of urban environments targeting autonomous driving. Int. J. Comput. Commun. Control 2023, 18, e5879. [Google Scholar] [CrossRef]
  164. Zayani, H.M. Unveiling the Potential of YOLOv9 through Comparison with YOLOv8. Int. J. Intell. Syst. Appl. Eng. 2024, 12, 2845–2854. [Google Scholar]
  165. Mei, J.; Zhu, W. BGF-YOLOv10: Small Object Detection Algorithm from Unmanned Aerial Vehicle Perspective Based on Improved YOLOv10. Sensors 2024, 24, 6911. [Google Scholar] [CrossRef]
  166. Cheng, S.; Han, Y.; Wang, Z.; Liu, S.; Yang, B.; Li, J. An Underwater Object Recognition System Based on Improved YOLOv11. Electronics 2025, 14, 201. [Google Scholar] [CrossRef]
  167. Mukhamediev, R.I.; Symagulov, A.; Kuchin, Y.; Zaitseva, E.; Bekbotayeva, A.; Yakunin, K.; Assanov, I.; Levashenko, V.; Popova, Y.; Akzhalova, A.; et al. Review of Some Applications of Unmanned Aerial Vehicles Technology in the Resource-Rich Country. Appl. Sci. 2021, 11, 10171. [Google Scholar] [CrossRef]
  168. Ma, L.; Li, Y.; Li, J.; Wang, C.; Wang, R.; Chapman, M.A. Mobile Laser Scanned Point-Clouds for Road Object Detection and Extraction: A Review. Remote Sens. 2018, 10, 1531. [Google Scholar] [CrossRef]
  169. Shah, S.F.A.; Mazhar, T.; Al Shloul, T.; Shahzad, T.; Hu, Y.C.; Mallek, F.; Hamam, H. Applications, challenges, and solutions of unmanned aerial vehicles in smart city using blockchain. PeerJ Comput. Sci. 2024, 10, e1776. [Google Scholar] [CrossRef] [PubMed]
Figure 1. Complete workflow for the automation of road signage inventory.
Figure 2. PRISMA flow diagram.
Figure 3. Distribution of articles.
Figure 4. Multisensor data fusion.
Figure 5. Traffic sign detection.
Figure 6. Example of semantic segmentation “Ibarra/Ecuador”.
Figure 7. Traffic sign detection using multisensor data fusion.
Figure 8. Evolution of Research Lines.
Table 1. Search queries in the databases.
Database | Equation
Web of Science | TS= ((“traffic sign*” OR “road sign” OR “signage”) AND (“detect” OR “inventor” OR “manage”) AND (“drone” OR “UAV” OR “unmanned aerial vehicle”) AND (“artificial intelligence” OR “machine learning” OR “deep learning”) NOT (“education” OR “medical”))
ScienceDirect | (“traffic sign” OR “road sign” OR “signage”) AND (“detection” OR “inventory”) AND (“drones” OR “UAV”) AND (“artificial intelligence” OR “machine learning”)
Scopus | TITLE-ABS-KEY ((signage AND detection) OR (traffic AND sign AND inventory) OR (lidar)) AND (drone OR uav) AND (“machine learning” OR “deep learning”) AND (geospatial AND analysis OR gis OR “geographic information systems”) AND (LIMIT-TO (LANGUAGE, “English”))
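For readers who wish to reproduce or adapt the literature search, the boolean strings in Table 1 can be assembled programmatically before being pasted into each database's advanced-search form. The short Python sketch below rebuilds the Web of Science TS query from its term groups; the build_ts_query helper is purely illustrative and is not part of any database API.

def quote(term):
    # Quote each term so that multi-word phrases are matched literally.
    return f'"{term}"'

def build_ts_query(include_groups, exclude_terms):
    # AND together OR-groups of included terms, then exclude unwanted topics with NOT.
    groups = ["(" + " OR ".join(quote(t) for t in group) + ")" for group in include_groups]
    query = " AND ".join(groups)
    if exclude_terms:
        query += " NOT (" + " OR ".join(quote(t) for t in exclude_terms) + ")"
    return f"TS= ({query})"

include_groups = [
    ["traffic sign*", "road sign", "signage"],
    ["detect", "inventor", "manage"],
    ["drone", "UAV", "unmanned aerial vehicle"],
    ["artificial intelligence", "machine learning", "deep learning"],
]
exclude_terms = ["education", "medical"]
print(build_ts_query(include_groups, exclude_terms))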
Table 2. Performance comparison between YOLOv4 and Faster R-CNN in different environments.
Model | Urban Accuracy (%) | Rural Accuracy (%) | Urban Recall (%) | Rural Recall (%) | Ref.
Faster R-CNN | 87 | 90 | 83 | 86 | [1]
YOLOv4 | 92 | 88 | 85 | 82 | [2]
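The accuracy and recall values in Table 2, and the precision/recall figures quoted in later tables, follow the standard detection-metric definitions in terms of true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN):

\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad \mathrm{Precision} = \frac{TP}{TP + FP}, \qquad \mathrm{Recall} = \frac{TP}{TP + FN}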
Table 3. Comparison between structures of YOLOv3, YOLOv4, and YOLOv5 [71].
Feature | YOLOv3 | YOLOv4 | YOLOv5
Neural network type | Fully convolutional | Fully convolutional | Fully convolutional
Backbone feature extractor | Darknet-53 | CSPDarknet53 | CSPDarknet53
Loss function | Binary cross-entropy | Binary cross-entropy | Binary cross-entropy and logits loss function
Neck | FPN | SPP and PANet | PANet
Head | YOLO layer | YOLO layer | YOLO layer
Table 5. Studies on UAVs and AI algorithms for traffic sign detection.
UAV Type | Sensors Used | AI Algorithm | Metric | Application Domain | Key Results | Ref.
DJI Zenmuse P1 | High-resolution camera | MSA-CenterNet | mAP: 86.7%; Precision: 89.2%; Recall: 90.6% | Detection of infrastructure along roads | Significant improvement in small object detection; outperformed algorithms such as SSD, Faster R-CNN, RetinaNet, and YOLOv5 in most categories | [3]
DJI Matrice 600 replica | RGB camera | Faster R-CNN | mAP: 56.30%; AP: 43.8% (all areas), 60.5% (small areas), 51.8% (medium areas) | Traffic sign detection in civil infrastructures | Better performance on signs; creation of a new dataset with greater sign variety | [17]
Multirotor | RGB camera | Improved YOLOv4 | mAP: 52.76% (VisDrone2019); mAP: 96.98% (S2TLD) | Small object detection in aerial images and urban traffic | Reduces parameters; improves small object detection; outperforms original YOLOv4 | [18]
Multirotor | LiDAR and multispectral camera | SCDNET | 80% | Rural roads | Improvement in signage detection accuracy | [21]
Multirotor | RGB camera | Shadow detection, EAOP | mAP: 85% | Various | Segmentation of complex images | [49]
Multirotor | Front camera | DeepLab v3+ | Accuracy: 93.14% | Autonomous vehicles | Accuracy in segmentation of low-resolution images | [54]
UAVs | High-resolution cameras (implied) | Improved YOLOv8 | Improvements in mAP@0.5 and mAP@0.5:0.95 | Drone aerial target detection | Enhanced precision; lighter model; superior detection; optimized efficiency | [68]
Multirotor | LiDAR | CRG/VRG | F1-score: 0.93 | Bridges and roads | Automatic segmentation and detection of road infrastructure | [74]
Custom UAVs | LiDAR | 3D CNNs (SS-3DCNN) | mIoU: 80.1% | Forest environments | Semantic segmentation of point clouds applicable to road signage | [75]
DJI Inspire 2 | Cameras | CNNs | 95%; 85% | Visual pollution classification | Model achieved high accuracy; applicable to live videos/images | [98]
UAV–LiDAR and MPS | LiDAR; GNSS; IMU; GoPro Hero 7; GNSS Topcon HyperPro | CSF; SfM | RMSE: 1.8–2.3 cm; average deviation: 0.18–0.19% | Road geometry analysis | Precise extraction of geometric parameters; MPS as a viable alternative to LiDAR | [109]
DJI Phantom 4 Pro | DJI camera; Velodyne Puck LiDAR; SBG Ellipse-D IMU | SIFT; SfM; PCA; sparse bundle adjustment | RMSE: 0.5 pixels; RMSE: 5 cm | UAV photogrammetry; mobile LiDAR mapping | Automatic extraction of control points; improved accuracy; reduced acquisition time | [110]
N/A | High-resolution aerial images | CNN (Inception V3) | Accuracy: 90.07%; Kappa: 0.81 | Post-earthquake building group damage classification | Effective classification of building damage at block level | [111]
Various UAVs | Cameras | Various algorithms | AP (IoU = 0.50:0.05:0.95), AP (IoU = 0.50), AP (IoU = 0.75) | Object detection in images | 10,209 images; 6471 training, 548 validation, 1580 test-challenge, 1610 test-dev | [112]
UAVs and helicopters | RGB cameras | YOLOv2 | 80.69% mAP for helicopters; 74.48% mAP for UAVs | Object detection in post-disaster aerial images | Better performance with balanced data and pre-training on VOC; model generalizable to new disasters | [113]
Multirotor | RGB, RF, audio, LiDAR | CNNs, YOLO, SSD, Faster R-CNN | Accuracy: 70–93% | UAV detection, infrastructure inspection, forestry | Improved detection and classification; real-time processing | [114]
UAVs | High-resolution cameras | YOLOv8 | Increased precision, faster processing speeds | Traffic monitoring and ITS | YOLOv8 achieved higher accuracy and speed in processing images for vehicle detection compared to YOLOv5 | [115]
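Several of the detectors compared in Table 5 belong to the YOLO family, for which open-source implementations are readily available. The following minimal Python sketch assumes the open-source ultralytics package and a generic pretrained yolov8n.pt checkpoint; the image path and confidence threshold are placeholders, and the snippet only illustrates single-image inference of the kind used in the UAV studies above, not the exact pipeline of any cited work.

# Minimal sketch: run a pretrained YOLOv8 model on a UAV frame.
# Illustrative only; assumes the open-source "ultralytics" package;
# file names and the confidence threshold are placeholders.
from ultralytics import YOLO

model = YOLO("yolov8n.pt")                    # small pretrained checkpoint
results = model("uav_frame.jpg", conf=0.25)   # single-image inference

for result in results:
    for box in result.boxes:
        label = model.names[int(box.cls)]          # predicted class name
        confidence = float(box.conf)               # detection confidence
        x1, y1, x2, y2 = box.xyxy[0].tolist()      # bounding box in pixel coordinates
        print(f"{label}: {confidence:.2f} at ({x1:.0f}, {y1:.0f}, {x2:.0f}, {y2:.0f})")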
Table 6. Semantic segmentation studies with AI in road infrastructure.
Challenges | Segmentation Technique | AI Algorithm | Advantages | Ref.
Data management and processing of large 3D volumes | Semantic segmentation with LiDAR | CNNs | High precision in capturing 3D point clouds, even in motion | [48]
Limitation to frontal images | Semantic segmentation and super-resolution | CN4SRSS, DeepLab v3+ | High accuracy in segmentation of low-resolution images | [54]
Complexity in iterative processing | Semantic segmentation of point clouds | Self-Sorting 3D Convolutional Neural Network (SS-3DCNN) | High efficiency in label assignment for point clouds | [75]
High computational cost and processing time | Mobile LiDAR, semantic segmentation | CNNs | Precision in segmentation of infrastructure components | [116]
Requires high computational power | Point cloud segmentation | Bayesian deep learning | Improved handling of uncertainty in complex scenarios | [117]
Complex processing in densely populated urban areas | Road marking segmentation | Feature pyramid networks | Improves accuracy in detecting objects such as road signs | [118]
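The segmentation studies in Table 6, and the mIoU values reported in later tables, are typically evaluated with the intersection-over-union criterion: per-class IoU is computed from true positives, false positives, and false negatives, and mIoU is its mean over the C classes.

\mathrm{IoU}_c = \frac{TP_c}{TP_c + FP_c + FN_c}, \qquad \mathrm{mIoU} = \frac{1}{C}\sum_{c=1}^{C} \mathrm{IoU}_c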
Table 7. Quantitative challenges in UAV usage and data processing.
Quantitative Challenge | Description | Ref.
Dependence on large datasets | The performance of DLMs relies on the availability of large and diverse datasets, which are not always accessible | [49]
Environmental condition limitations | Variability in lighting and weather conditions can affect the quality of data collected by UAVs, reducing detection accuracy | [116]
Lack of evaluation standards | The absence of standardized protocols for evaluating AI model performance hinders comparison between studies and affects replicability | [116]
Data processing capacity | AI algorithms used, such as YOLOv4, require robust computational processing to handle the large volumes of generated data | [117]
Real-time data processing | Multi-sensor systems present a significant challenge in requiring real-time data synchronization, which is costly and complex | [117]
High implementation costs | The integration of LiDAR sensors and RGB cameras significantly increases UAV costs, limiting their application in real-world scenarios | [119]
Table 8. Scientific articles related to the application of DLMs in traffic sign detection and classification.
Techniques | Advantages | Challenges | mAP (%) | Accuracy (%) | Ref.
Hybrid-YOLO (YOLOv5x + ResNet-18) | Better detection and low computational effort with small dataset and adaptable to embedded systems | Similarity between components and dataset limitations | 0.99262 | 0.99441 | [90]
CNNs; data augmentation; RMSprop optimization; L2 regularization | Automated classification of visual pollutants | Limitation due to dataset size | N/A | 85 | [98]
Mask R-CNN, Canny detector; Hough transform for sign and pole detection | Simple and scalable method for estimating depth using signs as reference | Errors due to reflection; unusual shapes and sign inclination | N/A | 100 | [99]
HOG + Color, SVM, SfM, RANSAC | Accurate 3D detection, automatic cleaning, efficient reconstruction | Sign variability, occlusions, noise | N/A | 90.15 | [104]
HOG + BW-ELM | Efficient; accurate | Memory limitation | N/A | 97.19 | [120]
Faster R-CNN with ZF, VGG_CNN_M_1024 and VGG16 networks; data augmentation; hyperparameter tuning | Automatic and accurate detection of multiple defects with less preprocessing and automatic feature extraction | Balancing accuracy and speed when handling images with multiple similar defects | 0.83 | N/A | [121]
CNN, GSV API, GradCAM, oversampling, data augmentation | Automation of seismic vulnerability assessment; reduction of costs and time | GSV limitations in rural areas; possible classification errors | N/A | 88.2 | [122]
YOLOv8 for aerial vehicle detection | Higher accuracy and speed; better detection of small vehicles and improved architecture | Difficulties in shadows, high altitude and congestion; confusion between similar types | 79.7 | 80.3 | [115]
DFF-YOLOv3 | Improves detection of distant signs; maintains wide vision | Balancing accuracy and speed | 74.8 | N/A | [124]
Traffic sign segmentation using Azure Kinect | Low cost; no training required; processes high-resolution images; performs instance segmentation | Limited to speeds of 30–40 km/h; maximum distance of 16.2 m; low sensitivity to lateral signs | N/A | 82.73 | [125]
N/A: not mentioned.
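The mAP column in Table 8 and in the tables that follow denotes mean average precision: the per-class average precision (the area under the precision–recall curve) is averaged over the N classes, and variants such as mAP@0.5 or AP at IoU = 0.50:0.05:0.95 differ only in the IoU threshold(s) at which a detection is counted as a true positive.

\mathrm{AP} = \int_0^1 p(r)\,dr, \qquad \mathrm{mAP} = \frac{1}{N}\sum_{i=1}^{N} \mathrm{AP}_i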
Table 9. Relevant scientific articles for aerial image segmentation and processing.
Techniques | Advantages | Challenges | mAP (%) | Accuracy (%) | Ref.
Siamese UNet; MAC; deep supervision attention; combined loss | Pixel-level semantic change detection; multi-scale capture; effective fusion; gradient vanishing mitigation | Limited annotations; extreme class imbalance; small-scale changes | N/A | 0.7306 | [21]
PC1+C3 detection; EAOP, ObP and Cholesky removal | Higher accuracy without NIR; effective for irregular shadows; good grass reconstruction | False positives on dark surfaces; incomplete reconstruction; roof issues; classifier variability | N/A | 97.54 | [49]
CN4SRSS with ARNet | Improves segmentation in LR images; reduces computational cost and focuses on important regions | Inference speed affected with large input data | N/A | 93.14 | [54]
Comprehensive SVI quality evaluation framework with selected metrics and multi-scale analysis | Applicable to commercial and crowdsourced SVI with holistic evaluation and open-source code | Difficulty in automatically measuring certain metrics and potential bias in manual review | N/A | N/A | [62]
Semantic segmentation with superpoint graphs; cross-labeling; transfer learning; geometric quality control | Reduction of manual training data; adaptation to different systems and scenes; refinement of results | Classification of abutments and scanning artifacts; limited ability to identify buildings | N/A | 87 | [116]
KPConv with Variational Inference (VI) and Monte Carlo (MC) Dropout | Improvement in uncertainty estimation and out-of-distribution example detection | Significant increase in execution time and slight decrease in segmentation accuracy | N/A | 77.52 | [117]
Attentive Capsule Feature Pyramid Network (ACapsFPN) | Multi-scale feature extraction and fusion; improved feature representation through attention | Severe occlusions and highly eroded road markings | N/A | 0.7366 | [118]
Hierarchical feature-based contour extraction | Better accuracy in complex backgrounds; less sensitive to annotation errors; modular and transferable | Loss of contextual information in patch division | 95.7 | 99.0 | [128]
N/A: not mentioned.
Table 10. Relevant scientific articles for sensor integration and multisensor systems.
Techniques | Advantages | Challenges | mAP (%) | Accuracy (%) | Ref.
4DBatMap System (UAVs and USVs with LiDAR, cameras and echo sounders) | Complete coverage; integration of aerial and marine data; coastal change prediction | Processing and integration of multisensor data; weather conditions | N/A | 0.16–0.24 m (GNSS RTK) | [35]
Stepwise minimum spanning tree matching | Automatic registration of VLS and BLS point clouds; robust to differences in point density | Dependency on tree distribution; sensitivity to parameters | N/A | Rotation error < 0.06°; translation error < 0.05 m | [129]
CNN + BiLSTM for visual-inertial fusion | Does not require camera calibration; raw data processing; low computational cost | Generalization to real-world environments; real-time implementation | N/A | 0.167 | [130]
Land cover classification with random forest; fusion of UAV and LiDAR DEMs | Rapid and targeted DEM updates; low cost; use of multiple data sources | Vertical alignment of DEMs; UAV flight limitations | N/A | 89–91% | [131]
Iterative adaptive segmentation with LiDAR and HSRI data fusion | Overcomes shadow occlusion; improves horizontal accuracy | Variability in building shapes; complexity of urban environments | N/A | Completeness: 94.1%; correctness: 90.3%; quality: 85.5% | [132]
N/A: not mentioned.
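Several entries in Table 10 involve registering point clouds acquired from different platforms (e.g., the VLS–BLS registration in [129]), and ICP-based alignment reappears later in Table 14 [155]. As a point of reference, the sketch below shows a minimal point-to-point ICP alignment with the open-source Open3D library; the file names, voxel size, and 0.5 m correspondence threshold are placeholder values, and the cited studies use more elaborate, constraint- or feature-based registration.

# Minimal sketch: rigidly align two point clouds with point-to-point ICP (Open3D).
# Illustrative only; file names, voxel size, and distance threshold are placeholders.
import numpy as np
import open3d as o3d

source = o3d.io.read_point_cloud("uav_scan.pcd")        # cloud to be moved
target = o3d.io.read_point_cloud("reference_scan.pcd")  # fixed reference cloud

# Downsample to speed up correspondence search on large UAV/LiDAR clouds.
source_ds = source.voxel_down_sample(voxel_size=0.2)
target_ds = target.voxel_down_sample(voxel_size=0.2)

result = o3d.pipelines.registration.registration_icp(
    source_ds,
    target_ds,
    0.5,          # maximum correspondence distance (metres)
    np.eye(4),    # initial guess: identity transform
    o3d.pipelines.registration.TransformationEstimationPointToPoint(),
)

print("fitness:", result.fitness)          # fraction of matched points
print("inlier RMSE:", result.inlier_rmse)  # residual in the units of the clouds
print(result.transformation)               # 4x4 rigid transform source -> target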
Table 11. Relevant scientific articles for the use of SLAM.
Techniques | Advantages | Challenges | mAP (%) | Accuracy (%) | Ref.
Motion constraints for BA in uncalibrated multi-camera systems | Works without overlapping FoVs or precise synchronization | Sensitive to dense traffic and complex long trajectories | N/A | ≤86.12% improvement in MAE | [11]
Automated inspection with 360° camera, LiDAR and deep learning | Flexible data acquisition; automatic defect detection; 3D reconstruction; BIM integration | Sensor calibration; data alignment; false positives | N/A | 90.8–93.2% | [133]
Image-based pole extraction for LiDAR localization | Rapid pole extraction; works in various environments; generates pseudo-labels for learning | Balance between accuracy and speed | N/A | 76.5% (geometric), 67.5% (learning) | [134]
Evaluation of feature point detection methods in vSLAM systems | Comprehensive analysis on complex datasets; comparison of traditional and neural network-based methods | Performance variability depending on environment; trade-off between accuracy and real-time operation | N/A | Varies by method and scenario | [136]
Semantic localization with enhanced satellite map and particle filter | Combines satellite and ground data; semantic and metric descriptors; robust in absence of GNSS | Requires prior area mapping; computational cost | N/A | 5.24 m RMSE | [138]
LiDAR for perception and positioning | High 3D precision; works in low light | High cost; sensitive to adverse weather | N/A | 90.3% (segmentation) | [140]
UAVs with LiDAR and variable spraying system | Reduces spraying volume; improves application efficiency; saves pesticides | Real-time processing of large point clouds | N/A | 76.04% (average coverage) | [141]
Surfel-based Mapping (SuMa) | Dense and real-time mapping with 3D laser; detects loop closures and optimizes the pose graph for consistent maps | Management of environments with few references; distinguishing between static and dynamic objects in ambiguous situations | N/A | 1.4% (average translational error) | [142]
LiDAR odometry, NDT, ISAM2, IMU preintegration | Real-time estimation; robust without GNSS; updatable local map | Young forest environment; limited UAV resources | N/A | APE: 0.2471–0.3123; RPE: 0.1521–0.2646 | [143]
Multi-camera fisheye system for 3D perception in autonomous vehicles | 360° coverage, full FOV with few cameras; low cost; automatic calibration; dense and sparse mapping; visual localization; obstacle detection | Fisheye lens distortion; frequent calibration required; fusion of data from multiple cameras | N/A | ~7 cm (mapping); <10 cm (obstacles) | [144]
Online detection of bridge columns using UAVs | Robust recognition of structural components without prior 3D model | Computational optimization for real-time processing | N/A | 96.0% (structural components) | [145]
Deep learning-based object detection | Improved accuracy in small object detection | High computational complexity | 91.7% | N/A | [146]
N/A: not mentioned.
Table 12. Relevant scientific articles for object detection using UAVs.
Techniques | Advantages | Challenges | mAP (%) | Accuracy (%) | Ref.
MSA-CenterNet with ResNet50, multiscale fusion and attention | Improved detection of small objects and scale variation | Computational complexity | 86.7 | 89.2 | [3]
MOD-YOLO | Efficient multispectral fusion | Speed–memory–accuracy balance | 46.0 | 86.3 (mAP50) | [14]
Faster R-CNN with InceptionResnetV2 | Detection and georeferencing of traffic signs in UAV images | Lack of labeled UAV images and class imbalance | 56.30 | N/A | [17]
CSP in YOLOv4 neck + SiLU + new detection head + simplified BiFPN + coordinated attention | Reduces parameters; improves accuracy; extracts more location information; focuses on spatial relationships | Slight speed reduction | 52.76 (VisDrone), 96.98 (S2TLD) | 62.71 (VisDrone), 93.35 (S2TLD) | [18]
CeDiRNet (center direction regression) | Support from surrounding pixels; domain-agnostic locator; point annotation | Occlusions between objects; very dense scenes | N/A | 98.74 (Acacia-06) | [93]
Modified Mask R-CNN with 9 convolutional layers, 4 max-pooling, 1 detection | Real-time change detection; zoom for greater detail | Extended training time | N/A | 94.3 (F1-score) | [147]
UAVs + vehicle trajectories | High-quality data; simultaneous capture | Processing large data volume | N/A | MAPE < 6 | [148]
ADAS-GPM | Improves small object detection; dynamic label assignment; Gaussian similarity metric | IoU sensitivity for small objects; sample imbalance | 27.1 | AP50: 58.6 | [149]
N/A: not mentioned.
Table 13. Relevant scientific articles for model performance evaluation.
Techniques | Advantages | Challenges | mAP (%) | Accuracy (%) | Ref.
YOLOv5l with rule-based post-processing | Multiple defect detection; pavement condition index; low cost | Limitations with small defects | 59 | 61 | [150]
UAS LiDAR flight planning | Point density estimation; LiDAR sensor comparison | Weather factors not considered | N/A | 88.9–92.4 | [152]
UAS LiDAR + RGB for levee damage detection | High resolution; low cost; frequent acquisition | Variable GRVI threshold; manual verification required | N/A | 95–99 (GRVI) | [153]
N/A: not mentioned.
Table 14. Relevant scientific articles on recent challenges and advances in photogrammetry and 3D mapping.
Techniques | Advantages | Challenges | mAP (%) | Accuracy (%) | Ref.
Digital twin of roads using map data | Uses existing data without field survey; follows road engineering representation; detects road components; eliminates defects from low-quality data | Implemented only in flat areas; does not model side ditches | N/A | 6.7 cm | [73]
CRG and VRG | Automatic surface extraction; efficient processing of large datasets | Sensitive to input parameters; requires adjustment for complex bridges | N/A | 0.932–0.998 | [74]
ThickSeg | Preserves 3D geometry; efficient; versatile | Loss of details in projection | N/A | 53.4% mIoU | [75]
Semantic segmentation of point cloud data using raw laser scanner measurements and deep neural networks | Works with non-georeferenced data; avoids trajectory error issues | Real-time classification; misclassification of branches as trunks; less spatial context than point cloud-based methods | N/A | 80.1% mIoU | [76]
PointNet++ | Captures local features at multiple scales | Sensitive to the number of input points | N/A | 86.75% | [103]
PCIS (point cloud classification based on image-based instance segmentation) | Uses 2D images to classify 3D point cloud | Occlusions and multiclass segmentation | 48.41 | 82.8% | [126]
UAV photogrammetry + AI for crack detection + digital twin augmented by damage + VR | Remote inspection; automatic detection; interactive 3D visualization | Image quality; environmental conditions; data size | N/A | 0.391 cm | [154]
Progressive homography with SIFT and RANSAC + Deep-SORT + Kalman + ICP | Does not require GPS; works with crowdsourced videos; projects countable and massive objects | Error accumulation over time; requires initial reference points | 74.48 | 32.7–36.9 ft | [155]
N/A: not mentioned.
Table 15. Relevant scientific articles for aerial image segmentation.
Techniques | Advantages | Challenges | mAP (%) | Accuracy (%) | Ref.
Hybrid method combining CHM-based segmentation and point-based clustering with multiscale adaptive LM filter and supervoxel-weighted fuzzy clustering | Superior overall performance compared to existing methods, enhanced computational efficiency, and accurate individual tree segmentation | Parameter adjustment for diverse forest types and tree densities | N/A | 88.27 | [158]
N/A: not mentioned.
Table 16. Relevant scientific articles for object detection with UAVs and LiDAR.
Techniques | Advantages | Challenges | mAP (%) | Accuracy (%) | Ref.
Radiometric analysis of VLP-32C laser scanner, road marking degradation model based on 3D point cloud intensity, generation of degradation maps | Reliable estimation of retroreflectivity without dedicated equipment, detection of highly degraded areas, intuitive visualization of degradation | Robust segmentation of highly degraded road markings, noise in individual measurements | N/A | N/A | [47]
UAV–LiDAR and mobile photogrammetric system (MPS) for road geometric parameter extraction | Acquisition of precise longitudinal and cross-sectional profiles, efficient extraction of road geometric parameters, MPS as a cost-effective alternative to LiDAR systems | Non-ground point filtering, temporal synchronization of multiple sensors, processing of large-scale datasets | N/A | N/A | [109]
N/A: not mentioned.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
