Search Results (423)

Search Parameters:
Keywords = aerial photography

24 pages, 4429 KB  
Article
SDP-YOLOv8: A Lightweight Enhancement Algorithm for Small Object Detection in UAV Aerial Photography
by You-Chao Lu, Yi-Han Xu, Wen Zhou and Ding Zhou
Appl. Sci. 2026, 16(10), 4941; https://doi.org/10.3390/app16104941 - 15 May 2026
Viewed by 140
Abstract
To overcome the limitations of existing UAV object detection algorithms—particularly missed detections, false alarms, and the progressive loss of fine-grained features for small objects—this paper proposes SDP-YOLOv8, a lightweight and parameter-efficient enhancement of YOLOv8. The design aims to improve small-object detection accuracy while maintaining a lightweight architecture suitable for deployment on memory-constrained UAV platforms. Four lightweight-oriented modifications are introduced: (1) SCFS, which combines SPD-Conv for low-information-loss downsampling with a C2f block and SimAM attention; (2) DCSPPF, expanding the receptive field via parallel dilated convolutions; (3) a GhostConv-infused Patch Merging upsampling layer for local context enhancement; and (4) an extra small-scale detection head to preserve fine details. On VisDrone2019, experimental results show that SDP-YOLOv8 improved mAP@0.5 by 3.90% and mAP@0.5:0.95 by 2.60%, with a 14.4% reduction in parameters. The model maintains real-time performance (53.5 FPS on an RTX 3090 at FP32 with batch size 1, 38.7 FPS on a Jetson Orin Nano with TensorRT FP16 at batch size 1) and offers a favorable trade-off between detection accuracy, parameter efficiency, and memory footprint, making it a potential candidate for onboard deployment on resource-limited UAVs in aerial monitoring scenarios, pending further validation on diverse datasets and hardware platforms. Full article
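The space-to-depth rearrangement that underlies the SPD-Conv downsampling named in the abstract above can be illustrated with a minimal sketch. It assumes a feature map stored as nested Python lists (channels, rows, columns) and a scale of 2. The function name `space_to_depth` and the output channel ordering are illustrative choices, not the authors' implementation, and the non-strided convolution that normally follows this step is omitted.

```python
def space_to_depth(feature_map, scale=2):
    """Rearrange each scale x scale spatial block into extra channels.

    feature_map: list of C channels, each an H x W list of lists.
    Returns C * scale**2 channels of size (H/scale) x (W/scale).
    Every input value is kept, unlike strided convolution or pooling.
    """
    height = len(feature_map[0])
    width = len(feature_map[0][0])
    if height % scale or width % scale:
        raise ValueError("height and width must be divisible by scale")
    out = []
    for dy in range(scale):          # row offset inside each block
        for dx in range(scale):      # column offset inside each block
            for channel in feature_map:
                out.append([row[dx::scale] for row in channel[dy::scale]])
    return out


# One 4x4 channel becomes four 2x2 channels (hypothetical toy input).
x = [[[0, 1, 2, 3],
      [4, 5, 6, 7],
      [8, 9, 10, 11],
      [12, 13, 14, 15]]]
y = space_to_depth(x)
print(len(y), len(y[0]), len(y[0][0]))
```

The point of the sketch is that resolution is halved while all sixteen input values survive in the channel dimension, which is why this kind of downsampling is described as low in information loss.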

20 pages, 3843 KB  
Article
UAV-Assisted Pesticide Application in Potato Cultivation Under Waterlogged Soil Conditions: Orthophotomap-Based Monitoring and Field Assessment
by Andrey Ronzhin, Artem Ryabinov, Elena Shkodina, Anton Saveliev, Ekaterina Cherskikh and Aleksandra Figurek
Sustainability 2026, 18(9), 4567; https://doi.org/10.3390/su18094567 - 6 May 2026
Viewed by 393
Abstract
The operational limitations of ground-based machinery in waterlogged soils of northern regions often lead to missed agronomic treatments, resulting in substantial yield losses. This problem is important in potato production, where a delay in disease protection can quickly lead to loss of leaf mass and reduced yield. This article evaluates an integrated UAV-based approach for crop production in potato cultivation, encompassing aerial soil analysis, weed segmentation, targeted spraying, and yield prediction. A field experiment was designed using a developed GD-4 drone on a 300 m × 6 m test plot. Aerial photography was used to generate an orthophotomap for monitoring and planning pesticide applications. The UAV, operating at a 2-m altitude, achieved a 3-m spray swath, enabling complete plot coverage. Visual assessment confirmed superior plant health in the test plot compared to the control. Quantitative analysis revealed a yield of 39.06 t/ha in the test plot, a 15.2% increase over the control plot (33.91 t/ha), with a comparable percentage of marketable tubers (94.5% vs. 93.3%). The study concludes that UAV technology is a reliable means of remote sensing and offers an alternative for ensuring timely agricultural operations and enhancing yield in inaccessible terrains. Full article
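The figures quoted in the abstract above can be checked with a few lines of arithmetic. This is a sketch only: the helper name `relative_increase` is hypothetical, and the pass count assumes straight, non-overlapping passes along the 300 m length of the plot, which the abstract does not state.

```python
import math


def relative_increase(test_value, control_value):
    """Percentage increase of test_value over control_value."""
    return (test_value - control_value) / control_value * 100.0


# Yields reported in the abstract, in t/ha.
yield_gain_pct = relative_increase(39.06, 33.91)

# Plot width and spray swath reported in the abstract, in metres.
plot_width_m = 6
swath_m = 3
passes = math.ceil(plot_width_m / swath_m)  # assumed non-overlapping passes

print(f"yield gain: {yield_gain_pct:.1f}%")
print(f"passes needed: {passes}")
```

Under these assumptions the computed gain rounds to the 15.2% stated in the abstract, and two passes cover the 6 m width.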

23 pages, 4374 KB  
Article
EFPN-YOLO: A Method for Small Target Detection in Unmanned Aerial Vehicles
by Yimeng Li, Wanwen Yi, Tingyi Zhang and Jun Wang
Appl. Sci. 2026, 16(9), 4526; https://doi.org/10.3390/app16094526 - 4 May 2026
Viewed by 297
Abstract
In drone aerial photography applications, small object detection is crucial. For instance, it enables locating missing individuals on the ground during search-and-rescue operations, identifying distant vehicles in traffic monitoring, and detecting early-stage pest infestations in agricultural fields. However, aerial images present a unique challenge: due to the high flight altitude of drones, targets occupy only a minimal pixel area. Combined with complex backgrounds and sparse features, small objects are easily obscured by surrounding environments. To address these issues, this paper proposes the EFPN-YOLO model based on YOLOv12n. First, we introduce the Feature-Sharing Convolution (FSConv) module, which extracts multi-scale features with low parameter requirements through shared convolution kernels and multi-scale sparse sampling. Second, by integrating deformable convolutions with a dual-channel attention mechanism, we develop the Enhanced Dual-Dimensional Calibration (EDDC) module, significantly improving spatial feature modeling capabilities and feature enhancement effectiveness. Finally, we construct the RC-FPN architecture, employing a bidirectional fusion structure and diagonal cross-layer skip connections to minimize information loss. Meanwhile, the Bottleneck structure in the C3K2 module is replaced with the RepViTBlock to construct the C3k2_RVB module, which enhances the multi-scale feature expression ability through a two-stage design of spatial and channel mixing. On the VisDrone2019 dataset, the model’s mAP50 improved from 33.9% to 40.7%; on the TinyPerson dataset, it rose from 13.9% to 19.2%; and on the NVIDIA Jetson Orin Nano 8 GB superplatform, the model achieved a frame rate (FPS) of 15. Experiments demonstrate that EFPN-YOLO excels in small object detection and holds significant practical value. Full article
(This article belongs to the Section Computing and Artificial Intelligence)

31 pages, 33887 KB  
Article
Deep Learning-Based Waterline Detection Applied to Wave Period Measurement in the Nearshore Swash Zone
by Laurence Zsu-Hsin Chuang, Po-An Tsai and Mei-Huei Chen
Remote Sens. 2026, 18(9), 1385; https://doi.org/10.3390/rs18091385 - 30 Apr 2026
Viewed by 252
Abstract
This study proposes an integrated framework combining unmanned aerial vehicle (UAV) aerial photography, AI-based waterline detection, and a rigorous quality control (QC) scheme for estimating wave periods in the swash zone. The proposed approach automatically extracts instantaneous waterlines from high-resolution UAV videos and converts them into wave series using timestack analysis. The DeepUNet model achieved a pixel-level recognition score of 75.0% for both F1-score and Dice, demonstrating reliable performance in detecting thin waterline features. The integration of spatial and temporal QC further improves the robustness of waterline tracking and reduces false detections. Wave periods derived from wave series across different cross-sections in the swash zone exhibit spatial consistency and qualitative agreement when compared in context with offshore data buoy observations, while the quantitative differences reflect variation in nearshore wave dynamics. These results confirm the feasibility and effectiveness of the proposed framework for high-resolution nearshore wave monitoring. Full article
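The abstract above reports the same value for F1-score and Dice. For a single binary mask this is expected, because the two metrics are algebraically identical: both reduce to 2TP / (2TP + FP + FN). The sketch below illustrates the identity on a made-up eight-pixel mask; the data and function names are hypothetical and are not taken from the paper.

```python
def confusion_counts(pred, truth):
    """Return (TP, FP, FN) for two equal-length binary (0/1) sequences."""
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)
    return tp, fp, fn


def f1_score(pred, truth):
    """Harmonic mean of precision and recall (needs at least one TP)."""
    tp, fp, fn = confusion_counts(pred, truth)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def dice(pred, truth):
    """Dice coefficient: 2|A and B| / (|A| + |B|) for 0/1 masks."""
    tp, _, _ = confusion_counts(pred, truth)
    return 2 * tp / (sum(pred) + sum(truth))


# Hypothetical flattened masks: 1 = waterline pixel, 0 = background.
pred = [1, 1, 1, 0, 0, 0, 1, 0]
truth = [1, 1, 0, 1, 0, 0, 1, 0]

f1_value = f1_score(pred, truth)
dice_value = dice(pred, truth)
print(f1_value, dice_value)
```

The two values can differ in practice only when they are aggregated differently, for example averaged per image versus computed over all pixels at once.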

25 pages, 53027 KB  
Article
Failure Mechanism of Sudden Rock Landslide Under the Coupling Effect of Hydrological and Geological Conditions: A Case Study of the Wanshuitian Landslide, China
by Pengmin Su, Maolin Deng, Long Chen, Biao Wang, Qingjun Zuo, Shuqiang Lu, Yuzhou Li and Xinya Zhang
Water 2026, 18(9), 1001; https://doi.org/10.3390/w18091001 - 23 Apr 2026
Viewed by 471
Abstract
At around 8:40 a.m. on 17 July 2024, the Wanshuitian landslide in the Three Gorges Reservoir Area (TGRA) experienced a deformation failure characterized by thrust load-caused deformations and high-speed sliding. Using geological surveys and unmanned aerial vehicle (UAV) photography, this study divided the Wanshuitian landslide area into five zones: sliding initiation (A1), secondary disintegration (A2), main accumulation (B1), right falling (B2), and left falling (B3) zones. Through monitoring data analysis and GeoStudio-based numerical simulations, this study revealed the mechanisms behind the landslide failure mode characterized by slope sliding approximately along the strike of the rock formation under the coupling effect of hydrological and geological conditions. The results indicate that factors inducing the landslide failure include the geomorphic feature of alternating grooves and ridges, the lithologic assemblage characterized by interbeds of soft and hard rocks, the slope structure with well-developed joints, and the sustained heavy rains in the preceding period. In the Wanshuitian landslide area, mudstone valleys are prone to accumulate rainwater, which can infiltrate directly into the weak interlayers of rock masses and soften the rock masses. Multi-peak rain events with a short time interval serve as a critical factor in groundwater recharge. Within 17 days preceding its failure, the Wanshuitian landslide experienced a superimposed process of heavy and secondary rain events with a short interval (four days). Rainwater from the first heavy rain event failed to completely discharge during the short interval, while the secondary rain event also caused rainwater accumulation. These led to a continuous rise in the groundwater table, a constant decrease in the shear strength of the slope, and ultimately the landslide instability. 
Since the landslide sliding in the dip direction of the rock formation was impeded, the main sliding direction of the landslide formed an angle of 88° with this direction. This led to a unique failure mode characterized by slope sliding approximately along the strike of the rock formation. Based on these findings, this study proposed characteristics for the early identification of the failure of similar landslides, aiming to provide a robust scientific basis for the monitoring, early warning, and prevention and control of the failure of similar landslides. Full article
(This article belongs to the Special Issue Water-Related Landslide Hazard Process and Its Triggering Events)

7 pages, 2729 KB  
Proceeding Paper
Unmanned Aerial Vehicles Aerial Photography Combined with Building Information Modeling Applied in Road Landscape Planning Research
by Ren-Jwo Tsay
Eng. Proc. 2026, 134(1), 9; https://doi.org/10.3390/engproc2026134009 - 30 Mar 2026
Viewed by 339
Abstract
In road planning and landscape design, data collection emphasizes existing site conditions, particularly in projects involving modifications rather than new construction, as such data directly inform subsequent planning decisions. Beyond conventional surveying techniques, large-scale street-region digital elevation models can be generated using aerial imagery acquired from unmanned aerial vehicles. The point clouds derived from these aerial photographs provide a basis for constructing spatial models applicable to street landscape and road planning. In this study, aerial data were processed using Pix4D software 4.8.4 to generate the initial spatial model, which was subsequently integrated into a building information modeling-based design framework in Autodesk Revit 2022. This approach enabled rapid and precise design outputs, while the resulting BIM model was further applied to mapping applications to establish a foundational database for regional public works. Full article

20 pages, 5966 KB  
Article
Target Recognition Model for Seedling Sugar Beets from UAV Aerial Imagery
by Meijuan Cheng, Yuankai Chen, Yu Deng, Zhixiong Zeng, Jiahui Song, Xiao Wu, Jie Liu, Zhen Yin and Zhigang Zhang
Agriculture 2026, 16(7), 737; https://doi.org/10.3390/agriculture16070737 - 26 Mar 2026
Viewed by 482
Abstract
The extensive cultivation scale of sugar beet seedlings has resulted in the necessity for accurate identification and monitoring of the seedling count, a task which has become crucial and highly challenging in the sugar industry. However, sugar beet seedlings in UAV aerial photography scenarios are mostly small targets with complex backgrounds. Existing general detection models not only have insufficient detection accuracy, but also struggle to balance computational efficiency and resource consumption. To meet the practical needs of field monitoring, this paper proposes the LDH-RTDETR, a sugar beet seedling detection model that balances high accuracy and light weight. This model uses LSNet for feature extraction to reduce size, adds a deformable attention (DAttention) module to capture fine-grained seedling features, and adopts HS-FPN to improve multi-scale feature fusion in the neck network. Experimental results show that the improved model significantly outperforms the original RT-DETR model, with a 3.6% increase in accuracy, a 2.1% increase in mAP50, a recall rate of 86.0%, and a final model size of only 43.3 MB, thus achieving an effective balance between accuracy and model size. This study’s improved model offers an efficient solution for large-area identification and counting of sugar beet seedlings, and is highly significant for advancing the automation of sugar crop field management and agricultural digital transformation. Full article
(This article belongs to the Section Agricultural Technology)

24 pages, 9340 KB  
Article
Engineering-Induced Extension of Deep-Seated Landslide at a Tunnel Portal on the Northeastern of the Qinghai–Tibet Plateau
by Guifei Huang, Lichun Chen, Minghua Hou, Dexian Liang, Ruidong Liu, Renmao Yuan and Lize Chen
Appl. Sci. 2026, 16(6), 2696; https://doi.org/10.3390/app16062696 - 11 Mar 2026
Viewed by 496
Abstract
Human engineering activity, such as cross-regional transportation construction, often disturbs the geological environment and triggers landslides. This study investigated a landslide induced by tunnel excavation in the northeastern region of the Qinghai–Tibet Plateau, exploring how a seemingly low-risk local small-scale landslide can trigger an engineering disaster. Based on field geological and geomorphological surveys, unmanned aerial vehicle (UAV) remote sensing photography, and SBAS-InSAR data analysis (time-series monitoring from 2021 to 2023), the spatiotemporal evolution patterns and causative mechanisms of landslide deformation were systematically elucidated. The results indicate the following: (1) The landslide evolved from initial multiple small local slides, gradually expanding and connecting to form a larger and deep-seated landslide. (2) SBAS-InSAR analysis revealed that the landslide deformation rate ranged from −38.13 to 12.01 mm/a, with a maximum cumulative deformation of 121.91 mm. Substantial deformation was concentrated in April–June 2021, June–August 2022, and April–July 2023. Spatially, the deformation intensity exhibited a pattern of middle section > front > rear, with greater deformation closer to the tunnel construction point. (3) The landslide deformation is primarily related to tunnel construction disturbance. The topography, geological structure, and frozen ground thawing exerted certain influences. The deformation mechanism is summarized as follows: Slope toe excavation initially triggers local sliding, leading to tension cracking at the rear edge. Subsequently, tunnel construction further promotes landslide expansion, resulting in the formation of a deep-seated landslide. This study showed that the landslide resulted from the combined effects of engineering activity and natural conditions. The results reveal that, under disturbances from inappropriate engineering activities, local small landslides may develop into major disasters. 
Therefore, the construction plan for the tunnel must be revised to mitigate such risks. Full article

25 pages, 10745 KB  
Article
Super-Resolution Remote Sensing Datasets for Application to Caral–Supe Archeological Sites Employing SAR and DEMs
by Jungrack Kim and Ramesh P. Singh
Remote Sens. 2026, 18(6), 854; https://doi.org/10.3390/rs18060854 - 10 Mar 2026
Viewed by 680
Abstract
Publicly accessible spaceborne remote sensing datasets often lack the spatial resolution required to reliably distinguish archeological features from their surrounding geomorphological contexts. In this study, we assess the potential of super-resolution (SR) products derived from multiple public-domain remote sensing datasets for a systematic archeological survey in the Caral–Supe region. We focus on Synthetic Aperture Radar (SAR) and topographic datasets—including Sentinel-1, Advanced Land Observing Satellite (ALOS) Phased Array L-band Synthetic Aperture Radar (PALSAR), and Digital Elevation Models (DEMs)—because of their capacity to detect subtle surface expressions and shallow subsurface structures obscured by vegetation or sediment cover. Using state-of-the-art deep learning algorithms, primarily employing the Enhanced Super-Resolution Generative Adversarial Network (ESRGAN) architecture, we integrated multi-source SAR imagery and DEM data to generate SR products that reveal distinct signatures in areas containing dense archeological remains and clearly delineate shallow, buried anthropogenic features. We further developed deep learning classification models that combine SR SAR and DEM inputs and trained them on known archeological site locations. This approach enabled the detection of previously undocumented structural features distributed along the coastal margin and throughout the Supe Valley. Our findings indicate that enhancing publicly available remote sensing datasets with advanced SR techniques can provide cost-effective and practical high-resolution archeological data, compared to data mining using aerial photography and high-resolution commercial satellite imagery, in terms of both cost and obstacle penetration. Full article

25 pages, 11205 KB  
Article
Remote Sensing Image Captioning via Self-Supervised DINOv3 and Transformer Fusion
by Maryam Mehmood, Ahsan Shahzad, Farhan Hussain, Lismer Andres Caceres-Najarro and Muhammad Usman
Remote Sens. 2026, 18(6), 846; https://doi.org/10.3390/rs18060846 - 10 Mar 2026
Viewed by 1202
Abstract
Effective interpretation of coherent and usable information from aerial images (e.g., satellite imagery or high-altitude drone photography) can greatly reduce human effort in many situations, both natural (e.g., earthquakes, forest fires, tsunamis) and man-made (e.g., highway pile-ups, traffic congestion), particularly in disaster management. This research proposes a novel encoder–decoder framework for captioning of remote sensing images that integrates self-supervised DINOv3 visual features with a hybrid Transformer–LSTM decoder. Unlike existing approaches that rely on supervised CNN-based encoders (e.g., ResNet, VGG), the proposed method leverages DINOv3’s self-supervised learning capabilities to extract dense, semantically rich features from aerial images without requiring domain-specific labeled pretraining. The proposed hybrid decoder combines Transformer layers for global context modeling with LSTM layers for sequential caption generation, producing coherent and context-aware descriptions. Feature extraction is performed using the DINOv3 model, which employs the gram-anchoring technique to stabilize dense feature maps. Captions are generated through a hybrid of Transformer with Long Short-Term Memory (LSTM) layers, which adds contextual meaning to captions through sequential hidden layer modeling with gated memory. The model is first evaluated on two traditional remote sensing image captioning datasets: RSICD and UCM-Captions. Multiple evaluation metrics like Bilingual Evaluation Understudy (BLEU), Consensus-based Image Description Evaluation (CIDEr), Recall-Oriented Understudy for Gisting Evaluation (ROUGE-L), and Metric for Evaluation of Translation with Explicit Ordering (METEOR), are used to quantify the performance and robustness of the proposed DINOv3 hybrid model. The proposed model outperforms conventional Convolutional Neural Network (CNN) and Vision Transformers (ViT)-based models by approximately 9–12% across most evaluation metrics. 
Attention heatmaps are also employed to qualitatively validate the proposed model when identifying and describing key spatial elements. In addition, the proposed model is evaluated on advanced remote sensing datasets, including RSITMD, DisasterM3, and GeoChat. The results demonstrate that self-supervised vision transformers are robust encoders for multi-modal understanding in remote sensing image analysis and captioning. Full article

23 pages, 5285 KB  
Article
An Exploratory Analysis of Geometric Alignments on Lane Departure Behaviors at Loop Ramps
by Ting Ge, Zhuying Dai, Yuhan Wang, Sen Cai, Zeyang Li and Xiaomeng Wang
Appl. Sci. 2026, 16(5), 2582; https://doi.org/10.3390/app16052582 - 8 Mar 2026
Viewed by 427
Abstract
Lane departure can cause lateral vehicle collisions and, in severe cases, lead to vehicles running off the road. Such incidents often occur on curved sections and ramps. This study focuses on loop ramps. To quantify the impact of geometric alignment characteristics of loop ramps on lane departure behaviors, unmanned aerial vehicle (UAV) aerial photography was used to collect operation videos of 10 loop ramps at 6 interchanges, and 762 vehicle trajectories under free-flow conditions were extracted with DataFromSky. Combined with the indicators of equivalent radius and trajectory design curvature difference, vehicle trajectories were systematically classified into three patterns via k-means clustering: in the direction of centrifugal force (IDCF), against the direction of centrifugal force (ADCF), and no-offset normal driving (NOND). A multinomial logistic regression model was constructed to analyze the influence of loop ramp geometric alignment characteristics on departure behaviors. The results show that for the horizontal alignment elements of loop ramps, an increase in circular curve radius, a decrease in circular curve length, and a decrease in the length of the transition curve entering the circular curve all increase the risk of IDCF; conversely, increases in these geometric parameters tend to increase the risk of ADCF. For the vertical alignment elements, there is a significant nonlinear negative correlation between the adjacent maximum gradient difference and lane departure behaviors. For the cross-section of loop ramps, widening can significantly suppress the risk of IDCF but slightly increase the risk of ADCF. This study reveals the synergistic influence mechanism of the three-dimensional (horizontal, vertical, and cross-sectional) geometric characteristics of combined alignments on lane departure behaviors at interchange loop ramps. Full article

16 pages, 5863 KB  
Article
A Rapid Aerial Image Mosaic Method for Multiple Drones Based on Key Frames
by Xiuzhen Wu, Yahui Qi, Liang Qin, Shi Yan and Jianxiu Zhang
Automation 2026, 7(2), 43; https://doi.org/10.3390/automation7020043 - 5 Mar 2026
Viewed by 578
Abstract
Due to their advantages of being low-cost, lightweight and flexible, and having wide shooting coverage, UAVs have played an important role in situational awareness in the fields of disaster prevention and mitigation, urban planning and management, etc. In these applications, UAV aerial photography is limited by the field of view, and high-definition panoramic images of the complete target area cannot be obtained. Image mosaic technology is essential, but an image mosaic using only a single UAV cannot meet the high real-time requirements for situational awareness. In response to the above problems, this paper proposes a multi-UAV fast aerial image mosaic method based on key frames. First, the multi-UAV area coverage flight strategy is determined according to the size of the task area and the UAV flight parameters; then, the field of view of the pod, the flight speed, and the flight altitude are used to determine the key frame extraction time period during the UAV aerial photography process. The image matching-rate calculation method is designed and the key frames are extracted during the extraction time period, and the key frames are returned to the ground visual puzzle system; in the ground visual puzzle system, the improved Laplacian pyramid method is used to quickly fuse and stitch the key frames extracted by each UAV to obtain a panoramic stitched map. The experiment shows that the method can quickly obtain high-precision real-scene map information of the task area. Compared with the single-UAV method and the multi-UAV full video stream-splicing method, this method greatly reduces the consumption of computing power and the requirements of communication bandwidth and improves the efficiency and real-time performance of panoramic map acquisition. Full article
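The abstract above says the key-frame extraction period is derived from the pod's field of view, the flight speed, and the flight altitude, but it does not give the formula. The sketch below shows one plausible relationship under stated assumptions: a nadir-pointing camera over flat ground, a footprint of 2·h·tan(FOV/2) along the flight direction, and a target forward overlap between consecutive key frames. The function names and the `overlap` parameter are hypothetical and are not taken from the paper.

```python
import math


def ground_footprint_m(altitude_m, fov_deg):
    """Along-track ground coverage of a nadir camera over flat terrain."""
    return 2.0 * altitude_m * math.tan(math.radians(fov_deg) / 2.0)


def key_frame_interval_s(altitude_m, fov_deg, speed_m_s, overlap=0.5):
    """Time between key frames that keeps the requested forward overlap.

    overlap is the assumed fraction of the footprint shared by
    consecutive key frames (0 <= overlap < 1).
    """
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    if speed_m_s <= 0:
        raise ValueError("speed must be positive")
    advance_m = ground_footprint_m(altitude_m, fov_deg) * (1.0 - overlap)
    return advance_m / speed_m_s


# Hypothetical flight: 100 m altitude, 90 degree FOV, 10 m/s, 50% overlap.
interval = key_frame_interval_s(100.0, 90.0, 10.0, overlap=0.5)
print(f"extract one key frame roughly every {interval:.1f} s")
```

With these example values the footprint is about 200 m, so a frame roughly every 10 s keeps half of each image shared with the next. A higher overlap gives more matching features for stitching at the cost of more frames sent to the ground station.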

18 pages, 11148 KB  
Article
YOLO-DSNet for Small Target Detection
by Haokun Xu, Huangleshuai He, Qike Zhi, Zhengyi Yang and Bocheng Han
Appl. Sci. 2026, 16(3), 1493; https://doi.org/10.3390/app16031493 - 2 Feb 2026
Viewed by 780
Abstract
Small target detection in Unmanned Aerial Vehicle (UAV) applications is often plagued by inherent challenges such as small object sizes, sparse information, and complex background interference. Traditional detection algorithms and existing YOLO series models suffer from limitations in detection accuracy and fine-grained detail preservation. To address this, this paper proposes YOLO-DSNet, a small target detection network based on YOLOv13n. First, we introduce the dual-stream attention module (DSAM), which enhances discriminative features by leveraging bidirectional context modeling. Second, we design the Multi-scale Attention C2f (MSA-C2f) module—an adaptive architecture that optimizes feature extraction via multi-scale enhancement, effectively preserving and integrating small target information. Finally, through dataset augmentation, we significantly improve the model’s detection performance. The proposed YOLO-DSNet achieves a mAP@0.5 improvement from 30.8% to 40.1% on the VisDrone2019 dataset with only 0.8 million additional parameters, yielding a 30% accuracy gain while increasing computational overhead by merely 11.6 Gigaflops (GFLOPs). Experiments demonstrate YOLO-DSNet’s effectiveness in small target detection tasks such as UAV aerial photography and remote sensing imagery, successfully balancing accuracy and efficiency with high practical value. Full article
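The abstract above quotes both an mAP@0.5 change from 30.8% to 40.1% and a "30% accuracy gain". The two figures are consistent if the 30% is read as a relative improvement over the baseline rather than a difference in percentage points. The short check below makes that reading explicit; the variable and function names are illustrative.

```python
def gains(baseline_pct, improved_pct):
    """Return (absolute gain in percentage points, relative gain in %)."""
    absolute_points = improved_pct - baseline_pct
    relative_pct = absolute_points / baseline_pct * 100.0
    return absolute_points, relative_pct


# mAP@0.5 values on VisDrone2019 as reported in the abstract.
absolute_gain_points, relative_gain_pct = gains(30.8, 40.1)
print(f"absolute gain: {absolute_gain_points:.1f} points")
print(f"relative gain: {relative_gain_pct:.1f}%")
```

Keeping the two kinds of gain apart matters when comparing the entries in this listing, since some abstracts report percentage-point differences and others do not say which convention they use.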
(This article belongs to the Section Computing and Artificial Intelligence)

17 pages, 5027 KB  
Article
Symmetry-Enhanced YOLOv8s Algorithm for Small-Target Detection in UAV Aerial Photography
by Zhiyi Zhou, Chengyun Wei, Lubin Wang and Qiang Yu
Symmetry 2026, 18(1), 197; https://doi.org/10.3390/sym18010197 - 20 Jan 2026
Viewed by 543
Abstract
To address the challenges of small-target detection in UAV aerial photography, such as small scale, blurred features, and complex background interference, this article proposes ACS-YOLOv8s, an optimized version of the YOLOv8s network. Most small man-made targets in UAV aerial scenes (e.g., small vehicles, micro-drones) are inherently symmetric, and this geometric attribute can significantly enhance the discriminability of blurred or incomplete target features; symmetry-aware mechanisms are therefore integrated into the improved modules to further boost detection performance. In the backbone, an adaptive feature enhancement module strengthens the edge and detail representation of small targets by dynamically modulating the receptive field with deformable attention, while also capturing symmetric contour features to improve the perception of target geometric structures. A cascaded multi-receptive field module is embedded at the end of the backbone to integrate multi-scale features hierarchically, balancing expressive ability and computational efficiency with a focus on fusing symmetric multi-scale features. In the neck, a spatially adaptive feature modulation network dynamically weights cross-layer features to preserve detail fidelity and models symmetric feature dependencies across channels to reduce the loss of discriminative information. Experimental results on the VisDrone2019 dataset show that ACS-YOLOv8s outperforms the baseline model in precision, recall, and mAP, with mAP50 increased by 2.8% to 41.6% and mAP50:90 increased by 1.9% to 25.0%, verifying its effectiveness and robustness in small-target detection in complex drone aerial-photography scenarios.
(This article belongs to the Section Computer)

18 pages, 7295 KB  
Article
Study on Right-Turning Vehicles’ Yielding Behavior for Crossing E-Bikes at Signalized Intersections
by Ting Ge, Tingting Hao, Sen Cai and Xiaomeng Wang
Urban Sci. 2026, 10(1), 55; https://doi.org/10.3390/urbansci10010055 - 16 Jan 2026
Viewed by 1106
Abstract
This study explored the factors influencing right-turning vehicles’ yielding behavior toward crossing e-bikes at signalized intersections, with the aim of improving safety for crossing e-bikes. Videos of different intersections were obtained through manual video recording and drone aerial photography. Spatiotemporal data for right-turning vehicles and straight-through e-bikes were extracted using Tracker 6.0 software. Right-turning vehicle yielding decisions were categorized into three types: no yielding, decelerating to yield, and stopping to yield. Five groups of potential variables influencing yielding decisions were selected: personal attributes of e-bike riders, traffic characteristics of e-bikes, traffic characteristics of right-turning vehicles, road characteristics, and right-turning vehicle–e-bike interaction characteristics. A multiple ordered logistic regression model was established to predict right-turning vehicle yielding decisions, and odds ratios (ORs) were calculated to show how the yielding probability changes with each factor. For every one-unit increase in the number of crossing e-bikes, the yielding probability increases to 1.002 times the original value; for every one-unit increase in the average speed of right-turning vehicles, it decreases to 0.406 times the original value; for every one-unit increase in the average crossing speed of e-bikes, it increases to 1.737 times the original value. Compared with a shared straight + right-turn lane, a dedicated right-turning lane increases the yielding probability of right-turning vehicles to 4.2 times the original value, and compared with not occupying a crosswalk, illegally occupying a crosswalk decreases it to 0.356 times the original value. These findings offer valuable insights for enhancing the safety of e-bikes crossing signal-controlled intersections.
(This article belongs to the Special Issue Urban Traffic Control and Innovative Planning)
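The odds ratios quoted in the e-bike yielding abstract can be connected back to regression coefficients and to probabilities with a few lines of arithmetic. The sketch below assumes the standard logistic interpretation, in which an odds ratio equals exp(coefficient) and multiplies the odds (not the probability directly); the abstract does not spell out this detail. The dictionary keys and the 0.5 baseline probability are hypothetical labels chosen for illustration, and only the OR values come from the abstract.

```python
import math

# Odds ratios as quoted in the abstract; the key names are illustrative labels.
REPORTED_OR = {
    "number_of_crossing_ebikes": 1.002,
    "avg_speed_right_turning_vehicle": 0.406,
    "avg_crossing_speed_ebike": 1.737,
    "dedicated_right_turn_lane": 4.2,
    "illegal_crosswalk_occupation": 0.356,
}


def coefficient(odds_ratio: float) -> float:
    """Logit-scale coefficient implied by an odds ratio (beta = ln OR)."""
    return math.log(odds_ratio)


def shifted_probability(p0: float, odds_ratio: float, delta: float = 1.0) -> float:
    """Probability after a `delta`-unit change, starting from baseline p0."""
    odds = p0 / (1.0 - p0) * odds_ratio ** delta  # the OR acts on the odds
    return odds / (1.0 + odds)                    # convert odds back to probability


# Hypothetical baseline: a 50% chance of yielding before the change.
for name, value in REPORTED_OR.items():
    print(f"{name}: beta={coefficient(value):+.3f}, "
          f"p(0.5 -> {shifted_probability(0.5, value):.3f})")
```

Because the odds ratio scales the odds, the resulting change in probability depends on the baseline, so the same OR gives a different probability shift at a baseline of 0.1 than at 0.5.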
