Search Results (65)

Search Parameters:
Keywords = CSPDarknet53

19 pages, 3920 KB  
Article
HCDFI-YOLOv8: A Transmission Line Ice Cover Detection Model Based on Improved YOLOv8 in Complex Environmental Contexts
by Lipeng Kang, Feng Xing, Tao Zhong and Caiyan Qin
Sensors 2025, 25(17), 5421; https://doi.org/10.3390/s25175421 - 2 Sep 2025
Viewed by 155
Abstract
When unmanned aerial vehicles (UAVs) perform transmission line ice cover detection, variable shooting angles and complex background environments often lead to poor ice-cover recognition accuracy and difficulty in accurately identifying the target. To address these issues, this study proposes an improved icing detection model based on HCDFI–You Only Look Once version 8 (HCDFI-YOLOv8). First, a cross-dense hybrid (CDH) parallel heterogeneous convolutional module is proposed, which not only improves the detection accuracy of the model but also effectively alleviates the surge in floating-point operations that accompanies the improvements. Second, deep and shallow feature weighted fusion using an improved CSPDarknet53 to 2-Stage FPN_Dynamic Feature Fusion (C2f_DFF) module is proposed to reduce feature loss in the neck network. Third, the detection head is optimized with the feature adaptive spatial feature fusion (FASFF) detection head module to enhance the model’s ability to extract features at different scales. Finally, a new inner-complete intersection over union (Inner_CIoU) loss function is introduced to address the limitations of the CIoU loss function used in the original YOLOv8. Experimental results demonstrate that the proposed HCDFI-YOLOv8 model achieves a 2.7% improvement in mAP@0.5 and a 2.5% improvement in mAP@0.5:0.95 compared to standard YOLOv8. Among twelve models for icing detection, the proposed model delivers the highest overall detection accuracy. These results verify the accuracy of the HCDFI-YOLOv8 model in complex transmission line environments and provide effective technical support for transmission line ice cover detection. Full article
(This article belongs to the Special Issue AI-Based Computer Vision Sensors & Systems—2nd Edition)
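The CIoU and Inner-CIoU terms in the abstract above can be sketched for axis-aligned boxes. This is an illustrative pure-Python reading of the published formulas, not the authors' implementation; in particular, the `ratio` parameter and the way `inner_ciou` combines the terms are assumptions based on the Inner-IoU idea of measuring overlap on boxes scaled about their centers.

```python
import math

def iou(a, b):
    """IoU of two boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def ciou(a, b):
    """Complete IoU: IoU minus center-distance and aspect-ratio penalties."""
    i = iou(a, b)
    # squared distance between box centers
    rho2 = ((a[0] + a[2]) - (b[0] + b[2]))**2 / 4 + ((a[1] + a[3]) - (b[1] + b[3]))**2 / 4
    # squared diagonal of the smallest enclosing box
    cx1, cy1 = min(a[0], b[0]), min(a[1], b[1])
    cx2, cy2 = max(a[2], b[2]), max(a[3], b[3])
    c2 = (cx2 - cx1)**2 + (cy2 - cy1)**2 + 1e-9
    # aspect-ratio consistency term
    v = (4 / math.pi**2) * (math.atan((a[2] - a[0]) / (a[3] - a[1])) -
                            math.atan((b[2] - b[0]) / (b[3] - b[1])))**2
    alpha = v / (1 - i + v + 1e-9)
    return i - rho2 / c2 - alpha * v

def scale_box(box, ratio):
    """Shrink/grow a box about its center (the 'inner' box of Inner-IoU)."""
    cx, cy = (box[0] + box[2]) / 2, (box[1] + box[3]) / 2
    w, h = (box[2] - box[0]) * ratio, (box[3] - box[1]) * ratio
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)

def inner_ciou(a, b, ratio=0.75):
    """Inner-CIoU sketch: keep the CIoU penalties but measure overlap on scaled boxes."""
    return ciou(a, b) - iou(a, b) + iou(scale_box(a, ratio), scale_box(b, ratio))
```

A `ratio` below 1 scores overlap on shrunken core boxes, which sharpens the gradient for high-IoU pairs; the loss used in training would be `1 - inner_ciou(...)`.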

21 pages, 3089 KB  
Article
Lightweight SCL-YOLOv8: A High-Performance Model for Transmission Line Foreign Object Detection
by Houling Ji, Xishi Chen, Jingpan Bai and Chengjie Gong
Sensors 2025, 25(16), 5147; https://doi.org/10.3390/s25165147 - 19 Aug 2025
Viewed by 598
Abstract
Transmission lines are widely distributed in complex environments, making them susceptible to foreign object intrusion, which could lead to serious consequences, e.g., power outages. Currently, foreign object detection on transmission lines is primarily conducted through UAV-based field inspections. However, the captured data must be transmitted back to a central facility for analysis, resulting in low efficiency and the inability to perform real-time, industrial-grade detection. Although recent YOLO series models can be deployed on UAVs for object detection, these models’ substantial computational requirements often exceed the processing capabilities of UAV platforms, limiting their ability to perform real-time inference tasks. In this study, we propose a novel lightweight detection algorithm, SCL-YOLOv8, which is based on the original YOLO model. We introduce StarNet to replace the CSPDarknet53 backbone as the feature extraction network, thereby reducing computational complexity while maintaining high feature extraction efficiency. We design a lightweight module, CGLU-ConvFormer, which enhances multi-scale feature representation and local feature extraction by integrating convolutional operations with gating mechanisms. Furthermore, the detection head of the original YOLO model is improved by introducing shared convolutional layers and group normalization, which helps reduce redundant computations and enhances multi-scale feature fusion. Experimental results demonstrate that the proposed model not only improves the detection accuracy but also significantly reduces the number of model parameters. Specifically, SCL-YOLOv8 achieves a mAP@0.5 of 94.2% while reducing the number of parameters by 56.8%, FLOPS by 45.7%, and model size by 50% compared with YOLOv8n. Full article
(This article belongs to the Section Intelligent Sensors)
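Group normalization, one of the detection-head changes named in the abstract above, normalizes each group of channels with statistics computed per sample, so it behaves identically at any batch size — a useful property for the small per-device batches typical of UAV deployment. A minimal sketch on nested-list feature maps (illustrative only; real implementations also apply a learnable scale and shift per channel):

```python
def group_norm(x, num_groups, eps=1e-5):
    """Group normalization of a feature map x with shape [C][H][W] (nested lists).
    Each group of C // num_groups channels is normalized by its own mean/variance,
    computed over that group's channels and all spatial positions."""
    c = len(x)
    gsize = c // num_groups
    out = []
    for g in range(num_groups):
        chans = x[g * gsize:(g + 1) * gsize]
        # pool all values in the group to get its statistics
        vals = [v for ch in chans for row in ch for v in row]
        mean = sum(vals) / len(vals)
        var = sum((v - mean)**2 for v in vals) / len(vals)
        std = (var + eps) ** 0.5
        out.extend([[[(v - mean) / std for v in row] for row in ch] for ch in chans])
    return out
```

With `num_groups=1` this reduces to layer normalization over the feature map; with `num_groups=C` it reduces to instance normalization.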

34 pages, 9218 KB  
Article
SC-YOLO: A Real-Time CSP-Based YOLOv11n Variant Optimized with Sophia for Accurate PPE Detection on Construction Sites
by Teerapun Saeheaw
Buildings 2025, 15(16), 2854; https://doi.org/10.3390/buildings15162854 - 12 Aug 2025
Viewed by 590
Abstract
Despite advances in YOLO-based PPE detection, existing approaches primarily focus on architectural modifications. However, these approaches overlook second-order optimization methods for navigating complex loss landscapes in object detection. This study introduces SC-YOLO, integrating CSPDarknet backbone with Sophia optimization (leveraging efficient Hessian estimates for curvature-aware updates) for enhanced PPE detection on construction sites. The proposed methodology includes three key steps: (1) systematic evaluation of EfficientNet, DINOv2, and CSPDarknet backbones, (2) integration of Sophia second-order optimizer with CSPDarknet for curvature-aware updates, and (3) cross-dataset validation in diverse construction scenarios. Traditional manual PPE inspection exhibits operational limitations, including high error rates (12–15%) and labor-intensive processes. SC-YOLO addresses these challenges through automated detection with potential for real-time deployment in construction safety applications. Experiments on VOC2007-1 and ML-31005 datasets demonstrate improved performance, achieving 96.3–97.6% mAP@0.5 and 63.6–68.6% mAP@0.5:0.95. Notable gains include a 9.03% improvement in detecting transparent objects. The second-order optimization achieves faster convergence with 7% computational overhead compared to baseline methods, showing enhanced robustness over conventional YOLO variants in complex construction environments. Full article
(This article belongs to the Section Construction Management, and Computers & Digitization)

26 pages, 62045 KB  
Article
CML-RTDETR: A Lightweight Wheat Head Detection and Counting Algorithm Based on the Improved RT-DETR
by Yue Fang, Chenbo Yang, Chengyong Zhu, Hao Jiang, Jingmin Tu and Jie Li
Electronics 2025, 14(15), 3051; https://doi.org/10.3390/electronics14153051 - 30 Jul 2025
Viewed by 420
Abstract
Wheat is one of the most important grain crops, and spike counting is crucial for predicting yield. However, in complex farmland environments, wheat ear scales vary greatly, ear color closely resembles the background, and ears often overlap one another, all of which make wheat ear detection challenging. At the same time, the growing demand for high accuracy and fast response in wheat spike detection requires models to be lightweight and to reduce hardware costs. Therefore, this study proposes a lightweight wheat ear detection model, CML-RTDETR, for efficient and accurate detection of wheat ears in real, complex farmland environments. In the model construction, the lightweight network CSPDarknet is first introduced as the backbone of CML-RTDETR to enhance feature extraction efficiency. In addition, the FM module is introduced to modify the bottleneck layer in the C2f component, and hybrid feature extraction is realized by splicing spatial- and frequency-domain features to enhance feature extraction for wheat in complex scenes. Secondly, to improve the model’s detection capability for targets of different scales, a multi-scale feature enhancement pyramid (MFEP) is designed, consisting of GHSDConv, which efficiently obtains low-level detail information, and CSPDWOK, which constructs a multi-scale semantic fusion structure. Finally, channel pruning based on Layer-Adaptive Magnitude Pruning (LAMP) scoring is performed to reduce model parameters and runtime memory. The experimental results on the GWHD2021 dataset show that the AP50 of CML-RTDETR reaches 90.5%, an improvement of 1.2% over the baseline RTDETR-R18 model, while the parameters and GFLOPs decrease to 11.03 M and 37.8 G, reductions of 42% and 34%, respectively. Finally, the real-time frame rate reaches 73 fps, demonstrating both parameter reduction and speed improvement. Full article
(This article belongs to the Section Artificial Intelligence)
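The LAMP scoring used for pruning in the abstract above can be illustrated at the level of individual weights: within a layer, weights are sorted by squared magnitude, and each weight's score is its squared magnitude divided by the sum of squared magnitudes of all weights at least as large. Because the largest weight in every layer always scores 1, scores are comparable across layers and a single global threshold can be applied. A sketch (the `global_prune_mask` helper is a hypothetical illustration of global thresholding, not the paper's channel-level code):

```python
def lamp_scores(weights):
    """LAMP score per weight: w_u^2 / sum of w_v^2 over all weights with
    magnitude >= |w_u| (itself included), within one layer."""
    order = sorted(range(len(weights)), key=lambda i: weights[i]**2)  # ascending
    sq = [weights[i]**2 for i in order]
    # suffix sums of squared magnitudes in ascending order
    suffix = [0.0] * (len(sq) + 1)
    for i in range(len(sq) - 1, -1, -1):
        suffix[i] = suffix[i + 1] + sq[i]
    scores = [0.0] * len(weights)
    for rank, idx in enumerate(order):
        scores[idx] = sq[rank] / suffix[rank]
    return scores

def global_prune_mask(layers, keep_ratio):
    """Keep the top keep_ratio fraction of weights across all layers by LAMP score."""
    flat = [(s, li, wi) for li, layer in enumerate(layers)
            for wi, s in enumerate(lamp_scores(layer))]
    flat.sort(reverse=True)
    keep = {(li, wi) for _, li, wi in flat[:int(len(flat) * keep_ratio)]}
    return [[(li, wi) in keep for wi in range(len(layer))]
            for li, layer in enumerate(layers)]
```

Note how plain global magnitude pruning would compare raw |w| across layers and can empty a small-weight layer entirely; the per-layer normalization above avoids that failure mode.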

22 pages, 4741 KB  
Article
Research on Tunnel Crack Identification Localization and Segmentation Method Based on Improved YOLOX and UNETR++
by Wei Sun, Xiaohu Liu and Zhiyong Lei
Sensors 2025, 25(11), 3417; https://doi.org/10.3390/s25113417 - 29 May 2025
Viewed by 618
Abstract
To address the challenges in identifying and segmenting fine irregular cracks in tunnels, this paper proposes a new crack identification, localization and segmentation method based on improved YOLOX and UNETR++. The improved YOLOX recognition algorithm builds upon the original YOLOX network architecture. It replaces the original CSPDarknet backbone with EfficientNet to enhance multi-scale feature extraction while preserving fine texture characteristics of tunnel cracks. By integrating a lightweight ECA module, the proposed method significantly improves sensitivity to subtle crack features, enabling high-precision identification and localization of fine irregular cracks. The UNETR++ segmentation network is adopted to realize efficient and accurate segmentation of fine irregular cracks in tunnels through global feature capture capability and a multi-scale feature fusion mechanism. The experimental results demonstrate that the proposed method achieves integrated processing of crack identification, localization and segmentation, and performs especially well on the identification and segmentation of fine, irregular cracks. Full article
(This article belongs to the Section Intelligent Sensors)

23 pages, 5784 KB  
Article
RT-DETR-EVD: An Emergency Vehicle Detection Method Based on Improved RT-DETR
by Jun Hu, Jiahao Zheng, Wenwei Wan, Yongqi Zhou and Zhikai Huang
Sensors 2025, 25(11), 3327; https://doi.org/10.3390/s25113327 - 26 May 2025
Cited by 2 | Viewed by 1686
Abstract
With the rapid acceleration of urbanization and the increasing volume of road traffic, emergency vehicles frequently encounter congestion when performing urgent tasks. Failure to yield in a timely manner can result in the loss of critical rescue time. Therefore, this study aims to develop a lightweight and high-precision RT-DETR-EVD emergency vehicle detection model to enhance urban emergency response capabilities. The proposed model replaces ResNet with a lightweight CSPDarknet backbone and integrates an innovative hybrid C2f-MogaBlock architecture. A multi-order gated aggregation mechanism is introduced to dynamically fuse multi-scale features, improving spatial-channel feature representation while reducing the number of parameters. Additionally, an Attention-based Intra-scale Feature Interaction Dynamic Position Bias (AIDPB) module is designed, replacing fixed positional encoding with learnable dynamic position bias (DPB), improving feature discrimination in complex scenarios. The experimental results demonstrate that the improved RT-DETR-EVD model achieves superior performance in emergency vehicle detection under the same training conditions. Specifically, compared to the baseline RT-DETR-r18 model, RT-DETR-EVD reduces the parameter count to 14.5 M (a 27.1% reduction), lowers floating-point operations (FLOPs) to 49.5 G (a 13.2% reduction), and improves precision by 0.5%. Additionally, recall and mean average precision (mAP@0.5) increase by 0.6%, reaching an mAP@0.5 of 88.3%. The proposed RT-DETR-EVD model achieves a breakthrough balance between accuracy, efficiency, and scene adaptability. Its unique lightweight design enhances detection accuracy while significantly reducing model size and accelerating inference. This model provides an efficient and reliable solution for smart city emergency response systems, demonstrating strong deployment potential in real-world engineering applications. Full article
(This article belongs to the Section Vehicular Sensing)

19 pages, 5134 KB  
Article
A Garbage Detection and Classification Model for Orchards Based on Lightweight YOLOv7
by Xinyuan Tian, Liping Bai and Deyun Mo
Sustainability 2025, 17(9), 3922; https://doi.org/10.3390/su17093922 - 27 Apr 2025
Cited by 2 | Viewed by 899
Abstract
The disposal of orchard garbage (including pruning branches, fallen leaves, and non-biodegradable materials such as pesticide containers and plastic film) poses major difficulties for horticultural production and soil sustainability. Unlike general agricultural garbage, orchard garbage often contains both biodegradable organic matter and hazardous pollutants, which complicates efficient recycling. Traditional manual sorting methods are labour-intensive and inefficient in large-scale operations. To this end, we propose a lightweight YOLOv7-based detection model tailored for the orchard environment. By replacing the CSPDarknet53 backbone with MobileNetV3 and GhostNet, a mean average precision (mAP) of 84.4% is achieved while requiring only 16% of the original model's computational load. Meanwhile, a supervised contrastive learning strategy further strengthens feature discrimination between horticulturally relevant categories and can distinguish compostable pruning residues from toxic materials. Experiments on a dataset containing 16 orchard-specific garbage types (e.g., pineapple shells, plastic mulch, and fertiliser bags) show that the model has high classification accuracy, especially for materials commonly found in tropical orchards. The lightweight nature of the algorithm allows for real-time deployment on edge devices such as drones or robotic platforms, and future integration with robotic arms for automated collection and sorting. By converting garbage into a compostable resource and separating contaminants, the technology aligns with national garbage segregation initiatives and global sustainability goals, providing a scalable pathway to reconcile ecological preservation and horticultural efficiency. Full article

20 pages, 4165 KB  
Article
Paint Loss Detection and Segmentation Based on YOLO: An Improved Model for Ancient Murals and Color Paintings
by Yunsheng Chen, Aiwu Zhang, Jiancong Shi, Feng Gao, Juwen Guo and Ruizhe Wang
Heritage 2025, 8(4), 136; https://doi.org/10.3390/heritage8040136 - 11 Apr 2025
Cited by 1 | Viewed by 783
Abstract
Paint loss is one of the major forms of deterioration in ancient murals and color paintings, and its detection and segmentation are critical for subsequent restoration efforts. However, existing methods still suffer from issues such as incomplete segmentation, patch noise, and missed detections during paint loss extraction, limiting the automation of paint loss detection and restoration. To tackle these challenges, this paper proposes PLDS-YOLO, an improved model based on YOLOv8s-seg, specifically designed for the detection and segmentation of paint loss in ancient murals and color paintings. First, the PA-FPN network is optimized by integrating residual connections to enhance the fusion of shallow high-resolution features with deep semantic features, thereby improving the accuracy of edge extraction in deteriorated areas. Second, a dual-backbone network combining CSPDarkNet and ShuffleNet V2 is introduced to improve multi-scale feature extraction and enhance the discrimination of deteriorated areas. Third, SPD-Conv replaces traditional pooling layers, utilizing space-to-depth transformation to improve the model’s ability to perceive deteriorated areas of varying sizes. Experimental results on a self-constructed dataset demonstrate that PLDS-YOLO achieves a segmentation accuracy of 86.2%, outperforming existing methods in segmentation completeness, multi-scale deterioration detection, and small target recognition. Moreover, the model maintains a favorable balance between computational complexity and inference speed, providing reliable technical support for intelligent paint loss monitoring and digital restoration. Full article
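The SPD-Conv replacement for pooling mentioned in the abstract above rests on a space-to-depth rearrangement: each scale×scale spatial block is moved into separate channels, so resolution drops without discarding any activations (unlike max or average pooling). A single-channel sketch of that transform (illustrative; SPD-Conv follows it with a stride-1 convolution over the expanded channels):

```python
def space_to_depth(x, scale=2):
    """Rearrange an HxW single-channel map (nested lists) into scale*scale
    channels of size (H/scale)x(W/scale). Every input value appears exactly
    once in the output, so no information is lost."""
    h, w = len(x), len(x[0])
    return [[[x[i * scale + di][j * scale + dj] for j in range(w // scale)]
             for i in range(h // scale)]
            for di in range(scale) for dj in range(scale)]
```

For a 2x2 input `[[1, 2], [3, 4]]`, the result is four 1x1 channels holding 1, 2, 3, and 4 — a 2x2 max pool would have kept only the 4.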

26 pages, 7308 KB  
Article
Recognition of Cordyceps Based on Machine Vision and Deep Learning
by Zihao Xia, Aimin Sun, Hangdong Hou, Qingfeng Song, Hongli Yang, Liyong Ma and Fang Dong
Agriculture 2025, 15(7), 713; https://doi.org/10.3390/agriculture15070713 - 27 Mar 2025
Viewed by 555
Abstract
In a natural environment, due to the small size of caterpillar fungus, its indistinct features, its similar color to surrounding weeds and background, and overlapping instances, identifying caterpillar fungus poses significant challenges. To address these issues, this paper proposes a new MRAA network, which consists of a feature fusion pyramid network (MRFPN) and the backbone network N-CSPDarknet53. MRFPN is used to solve the problem of weak features. In N-CSPDarknet53, the Da-Conv module is proposed to address the background and color interference problems in shallow feature maps. The MRAA network significantly improves accuracy, achieving 0.202 AP_S (AP for small objects) in small-target recognition, a 12% relative increase over the baseline's 0.180 AP_S. Additionally, the model is small (9.88 M), making it lightweight and easy to deploy on embedded devices, which greatly promotes the development and application of caterpillar fungus identification. Full article

19 pages, 2144 KB  
Article
PDNet by Partial Deep Convolution: A Better Lightweight Detector
by Wei Wang, Yuanze Meng, Han Li, Shun Li, Chenghong Zhang, Guanghui Zhang and Weimin Lei
Electronics 2025, 14(3), 591; https://doi.org/10.3390/electronics14030591 - 2 Feb 2025
Viewed by 804
Abstract
Model lightweighting is significant in edge computing and mobile devices. Current studies on fast network design mainly focus on model computation compression and speedup. Many models aim to compress models by dealing with redundant feature maps. However, most of these methods choose to preserve the feature maps with simple manipulations and do not effectively reduce redundant feature maps. This paper proposes a new convolution module, PDConv, which compresses redundant feature maps to reduce network complexity and increases network width to maintain accuracy. PDConv (Partial Deep Convolution) outperforms traditional methods in handling redundant feature maps, particularly in deep networks. Its FLOPs are comparable to depthwise separable convolution but with higher accuracy. This paper also proposes PDBottleNeck and PDC2f (Partial Deep CSPDarknet53 to 2-Stage FPN) and builds the lightweight network PDNet for experimental validation on the PASCAL VOC dataset. Compared to the popular HorNet, our method achieves an improvement of more than 25% in FLOPs and 1.8% in mAP50:95 accuracy. On the COCO2017 dataset, our large PDNet achieves a 0.5% improvement in mAP75 with lower FLOPs than the latest RepViT. Full article
(This article belongs to the Section Artificial Intelligence)

20 pages, 1025 KB  
Article
Empirical Evaluation and Analysis of YOLO Models in Smart Transportation
by Lan Anh Nguyen, Manh Dat Tran and Yongseok Son
AI 2024, 5(4), 2518-2537; https://doi.org/10.3390/ai5040122 - 26 Nov 2024
Cited by 5 | Viewed by 2378
Abstract
You Only Look Once (YOLO) and its variants have emerged as the most popular real-time object detection algorithms. They have been widely used in real-time smart transportation applications due to their low-latency detection and high accuracy. However, because of the diverse characteristics of YOLO models, selecting the optimal model according to various applications and environments in smart transportation is critical. In this article, we conduct an empirical evaluation and analysis study for most YOLO versions to assess their performance in smart transportation. To achieve this, we first measure the average precision of YOLO models across multiple datasets (i.e., COCO and PASCAL VOC). Second, we analyze the performance of YOLO models on multiple object categories within each dataset, focusing on classes relevant to road transportation such as those commonly used in smart transportation applications. Third, multiple Intersection over Union (IoU) thresholds are considered in our performance measurement and analysis. By examining the performance of various YOLO models across datasets, IoU thresholds, and object classes, we make six observations on these three aspects while aiming to identify optimal models for road transportation scenarios. It was found that YOLOv5 and YOLOv8 outperform other models in all three aspects due to their novel performance features. For instance, YOLOv5 achieves stable performance thanks to its cross-stage partial darknet-53 (CSPDarknet53) backbone, auto-anchor mechanism, and efficient loss functions including IoU loss, complete IoU loss, focal loss, and gradient harmonizing mechanism loss. Similarly, YOLOv8 outperforms others with its upgraded CSPDarknet53 backbone, anchor-free mechanism, and efficient loss functions like complete IoU loss and distribution focal loss. Full article
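The IoU-threshold dimension of the evaluation described above can be made concrete with a toy single-class AP computation: a detection counts as a true positive only if it matches an unmatched ground-truth box at or above the threshold, and COCO-style mAP@0.5:0.95 averages AP over thresholds 0.50 to 0.95 in steps of 0.05. This is a simplified all-point-interpolation sketch, not the official COCO evaluator.

```python
def iou(a, b):
    """IoU of two boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def average_precision(pred, gt, thr):
    """Single-class AP at one IoU threshold. pred: list of (score, box),
    gt: list of boxes. Greedy matching in descending score order."""
    pred = sorted(pred, key=lambda p: -p[0])
    matched = [False] * len(gt)
    tps = []
    for _score, box in pred:
        best, best_i = thr, -1
        for i, g in enumerate(gt):
            o = iou(box, g)
            if not matched[i] and o >= best:
                best, best_i = o, i
        if best_i >= 0:
            matched[best_i] = True
        tps.append(1 if best_i >= 0 else 0)
    # all-point interpolation: accumulate precision over recall increments
    ap, tp, fp, prev_recall = 0.0, 0, 0, 0.0
    for t in tps:
        tp += t
        fp += 1 - t
        recall = tp / len(gt)
        ap += (recall - prev_recall) * (tp / (tp + fp))
        prev_recall = recall
    return ap

def map_50_95(pred, gt):
    """COCO-style mAP@0.5:0.95: mean AP over thresholds 0.50, 0.55, ..., 0.95."""
    thrs = [0.5 + 0.05 * k for k in range(10)]
    return sum(average_precision(pred, gt, t) for t in thrs) / len(thrs)
```

A detection with IoU 0.7 against its ground truth is a perfect hit for mAP@0.5 but only clears four of the ten thresholds in mAP@0.5:0.95, which is why the two metrics in these abstracts can diverge so widely.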

17 pages, 2483 KB  
Article
Fire and Smoke Detection in Complex Environments
by Furkat Safarov, Shakhnoza Muksimova, Misirov Kamoliddin and Young Im Cho
Fire 2024, 7(11), 389; https://doi.org/10.3390/fire7110389 - 29 Oct 2024
Cited by 15 | Viewed by 2635
Abstract
Fire detection is a critical task in environmental monitoring and disaster prevention, with traditional methods often limited in their ability to detect fire and smoke in real time over large areas. The rapid identification of fire and smoke in both indoor and outdoor environments is essential for minimizing damage and ensuring timely intervention. In this paper, we propose a novel approach to fire and smoke detection by integrating a vision transformer (ViT) with the YOLOv5s object detection model. Our modified model leverages the attention-based feature extraction capabilities of ViTs to improve detection accuracy, particularly in complex environments where fires may be occluded or distributed across large regions. By replacing the CSPDarknet53 backbone of YOLOv5s with ViT, the model is able to capture both local and global dependencies in images, resulting in more accurate detection of fire and smoke under challenging conditions. We evaluate the performance of the proposed model using a comprehensive Fire and Smoke Detection Dataset, which includes diverse real-world scenarios. The results demonstrate that our model outperforms baseline YOLOv5 variants in terms of precision, recall, and mean average precision (mAP), achieving a mAP@0.5 of 0.664 and a recall of 0.657. The modified YOLOv5s with ViT shows significant improvements in detecting fire and smoke, particularly in scenes with complex backgrounds and varying object scales. Our findings suggest that the integration of ViT as the backbone of YOLOv5s offers a promising approach for real-time fire detection in both urban and natural environments. Full article

14 pages, 4478 KB  
Article
A New Kiwi Fruit Detection Algorithm Based on an Improved Lightweight Network
by Yi Yang, Lijun Su, Aying Zong, Wanghai Tao, Xiaoping Xu, Yixin Chai and Weiyi Mu
Agriculture 2024, 14(10), 1823; https://doi.org/10.3390/agriculture14101823 - 16 Oct 2024
Cited by 3 | Viewed by 1753
Abstract
To address the challenges associated with kiwi fruit detection methods, such as low average accuracy, inaccurate recognition of fruits, and long recognition time, this study proposes a novel kiwi fruit recognition method based on an improved lightweight network S-YOLOv4-tiny detection algorithm. Firstly, the YOLOv4-tiny algorithm utilizes the CSPdarknet53-tiny network as a backbone feature extraction network, replacing the CSPdarknet53 network in the YOLOv4 algorithm to enhance the speed of kiwi fruit recognition. Additionally, a squeeze-and-excitation network has been incorporated into the S-YOLOv4-tiny detection algorithm to improve accurate extraction of kiwi fruit characteristics from images. Finally, augmenting dataset pictures using the mosaic method has improved precision in the characteristic recognition of kiwi fruits. The experimental results demonstrate improved recognition and positioning of kiwi fruits: the mean average precision (mAP) stands at 89.75%, with a detection precision of 93.96% and a single-picture detection time of 8.50 ms. Compared to the YOLOv4-tiny network, the network in this study improves mean average precision by 7.07% and accelerates detection by 1.16%. Furthermore, the Squeeze-and-Excitation Network (SENet)-based enhancement is adopted in preference to the convolutional block attention module (CBAM) and efficient channel attention (ECA). This approach effectively addresses issues related to slow training speed and low recognition accuracy of kiwi fruit, offering valuable technical insights for efficient mechanical picking methods. Full article
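The squeeze-and-excitation step mentioned above recalibrates channels in three moves: pool each channel to a scalar ("squeeze"), pass the vector through a small two-layer bottleneck ("excitation"), and rescale each channel by the resulting sigmoid weight. A minimal sketch with hypothetical weight matrices `w1` and `w2` (real SE blocks learn these and size the hidden layer with a reduction ratio):

```python
import math

def squeeze_excite(fmap, w1, w2):
    """SE block sketch on a [C][H][W] nested-list feature map.
    w1: hidden x C weights (ReLU), w2: C x hidden weights (sigmoid)."""
    # squeeze: per-channel global average pooling
    z = [sum(v for row in ch for v in row) / (len(ch) * len(ch[0])) for ch in fmap]
    # excitation: FC -> ReLU -> FC -> sigmoid gives one weight per channel
    hidden = [max(0.0, sum(wij * zj for wij, zj in zip(wi, z))) for wi in w1]
    scale = [1 / (1 + math.exp(-sum(wij * hj for wij, hj in zip(wi, hidden))))
             for wi in w2]
    # recalibrate: multiply every value in a channel by that channel's weight
    return [[[v * s for v in row] for row in ch] for ch, s in zip(fmap, scale)]
```

Because the gate depends only on channel-wise averages, SE adds very few parameters relative to a convolution, which is why it suits a tiny backbone like CSPdarknet53-tiny.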

20 pages, 4849 KB  
Article
Occlusion Removal in Light-Field Images Using CSPDarknet53 and Bidirectional Feature Pyramid Network: A Multi-Scale Fusion-Based Approach
by Mostafa Farouk Senussi and Hyun-Soo Kang
Appl. Sci. 2024, 14(20), 9332; https://doi.org/10.3390/app14209332 - 13 Oct 2024
Cited by 8 | Viewed by 2901
Abstract
Occlusion removal in light-field images remains a significant challenge, particularly when dealing with large occlusions. An architecture based on end-to-end learning is proposed to address this challenge that interactively combines CSPDarknet53 and the bidirectional feature pyramid network for efficient light-field occlusion removal. CSPDarknet53 acts as the backbone, providing robust and rich feature extraction across multiple scales, while the bidirectional feature pyramid network enhances comprehensive feature integration through an advanced multi-scale fusion mechanism. To preserve efficiency without sacrificing the quality of the extracted features, our model uses separable convolutional blocks. A simple refinement module based on half-instance initialization blocks is integrated to explore the local details and global structures. The network’s multi-perspective approach guarantees almost total occlusion removal, enabling it to handle occlusions of varying sizes and complexity. Numerous experiments were run on sparse and dense datasets with varying degrees of occlusion severity to assess performance. The findings show significant advancements over current cutting-edge techniques on the sparse dataset, while competitive results are obtained on the dense dataset. Full article
(This article belongs to the Special Issue Advanced Pattern Recognition & Computer Vision)
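The bidirectional feature pyramid network's characteristic operation is fast normalized fusion: each fusion node combines its inputs as a convex combination with learnable non-negative weights, avoiding a softmax. A sketch on flattened feature maps (illustrative only; BiFPN applies this per node after resizing all inputs to a common scale):

```python
def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse equal-length feature vectors: output = sum_i w_i / (sum_j w_j + eps) * f_i,
    with weights clamped to be non-negative so the result stays a convex
    combination regardless of what the optimizer does to the raw weights."""
    w = [max(0.0, wi) for wi in weights]
    total = sum(w) + eps
    fused = [0.0] * len(features[0])
    for wi, f in zip(w, features):
        for k, v in enumerate(f):
            fused[k] += wi / total * v
    return fused
```

The division by `sum(w) + eps` costs far less than a per-node softmax while behaving almost identically, which is the trade-off the BiFPN authors report; a weight driven negative during training is simply clamped out of the combination.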

20 pages, 2415 KB  
Article
YOLOv8-RCAA: A Lightweight and High-Performance Network for Tea Leaf Disease Detection
by Jingyu Wang, Miaomiao Li, Chen Han and Xindong Guo
Agriculture 2024, 14(8), 1240; https://doi.org/10.3390/agriculture14081240 - 27 Jul 2024
Cited by 3 | Viewed by 1616
Abstract
Deploying deep convolutional neural networks on agricultural devices with limited resources is challenging due to their large number of parameters. Existing lightweight networks can alleviate this problem but suffer from low performance. To this end, we propose a novel lightweight network named YOLOv8-RCAA (YOLOv8-RepVGG-CBAM-Anchorfree-ATSS), aiming to locate and detect tea leaf diseases with high accuracy and performance. Specifically, we employ RepVGG to replace CSPDarkNet53 to enhance feature extraction capability and inference efficiency. Then, we introduce CBAM attention to FPN and PAN in the neck layer to enhance the model perception of channel and spatial features. Additionally, an anchor-based detection head is replaced by an anchor-free head to further accelerate inference. Finally, we adopt the ATSS algorithm to adapt the allocating strategy of positive and negative samples during training to further enhance performance. Extensive experiments show that our model achieves precision, recall, F1 score, and mAP of 98.23%, 85.34%, 91.33%, and 98.14%, outperforming the traditional models by 4.22~6.61%, 2.89~4.65%, 3.48~5.52%, and 4.64~8.04%, respectively. Moreover, this model has a near-real-time inference speed, which provides technical support for deploying on agriculture devices. This study can reduce labor costs associated with the detection and prevention of tea leaf diseases. Additionally, it is expected to promote the integration of rapid disease detection into agricultural machinery in the future, thereby advancing the implementation of AI in agriculture. Full article
