Search Results (152)

Search Parameters:
Keywords = point downsampling

13 pages, 1086 KiB  
Article
Focusing 3D Small Objects with Object Matching Set Abstraction
by Lei Guo, Ningdong Song, Jindong Hu, Huiyan Han, Xie Han and Fengguang Xiong
Appl. Sci. 2025, 15(8), 4121; https://doi.org/10.3390/app15084121 - 9 Apr 2025
Viewed by 198
Abstract
Current 3D object detection methods often fail to detect small objects because such objects contain few effective points, and reducing the loss of point information during representation learning remains a significant challenge. To this end, we propose an effective 3D detection method with object matching set abstraction (OMSA). We observe that key points are lost during feature learning with multiple set abstraction layers, especially during downsampling and querying. We therefore present a novel sampling module, focus-based sampling, which raises the sampling probability of small objects. In addition, we design a multi-scale cube query that matches small objects with close geometric alignment. Comprehensive experimental evaluations on the KITTI 3D benchmark demonstrate significant performance improvements in 3D object detection. Notably, the proposed framework exhibits competitive detection accuracy for small objects (pedestrians and cyclists). Through an ablation study, we verify that each module contributes to the performance enhancement and demonstrate the robustness of the method with respect to the balance factor.
(This article belongs to the Section Computing and Artificial Intelligence)
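The paper's focus-based sampling is not reproduced here; the sketch below only illustrates the general idea of biasing point selection toward small objects, assuming each point already carries a "focus" weight (all names and the weighting rule are illustrative):

```python
import numpy as np

def biased_downsample(points, weights, k, rng=np.random.default_rng(0)):
    """Sample k points without replacement, biased by per-point weights.

    points:  (N, 3) array of xyz coordinates.
    weights: (N,) non-negative scores, assumed larger for points believed
             to belong to small objects (the "focus" signal).
    """
    probs = weights / weights.sum()
    idx = rng.choice(len(points), size=k, replace=False, p=probs)
    return points[idx]

# Toy usage: points near a small cluster get 5x the sampling weight.
pts = np.random.rand(1000, 3)
w = np.where(np.linalg.norm(pts - 0.5, axis=1) < 0.1, 5.0, 1.0)
sampled = biased_downsample(pts, w, k=128)
```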

21 pages, 8937 KiB  
Article
LSOD-YOLOv8: Enhancing YOLOv8n with New Detection Head and Lightweight Module for Efficient Cigarette Detection
by Yijie Huang, Huimin Ouyang and Xiaodong Miao
Appl. Sci. 2025, 15(7), 3961; https://doi.org/10.3390/app15073961 - 3 Apr 2025
Viewed by 449
Abstract
Cigarette detection is a crucial component of public safety management. However, detecting such small objects poses significant challenges due to their size and limited feature points. To enhance the accuracy of small-target detection, we propose a novel small object detection model, LSOD-YOLOv8 (Lightweight Small Object Detection using YOLOv8). First, we introduce a lightweight adaptive-weight downsampling module in the backbone of YOLOv8 (You Only Look Once version 8), which both mitigates the information loss caused by conventional convolutions and reduces the overall parameter count of the model. Next, we incorporate a P2 layer (Pyramid Pooling Layer 2) in the neck of YOLOv8, blending shared convolutional information with independent batch normalization to design a P2-LSCSBD (P2 Layer-Lightweight Shared Convolutional and Batch Normalization-based Small Object Detection) detection head. Finally, we propose a new loss function, WIMIoU (Weighted Intersection over Union with Inner, Multi-scale, and Proposal-aware Optimization), combining the ideas of WiseIoU (Wise Intersection over Union), InnerIoU (Inner Intersection over Union), and MPDIoU (Mean Pairwise Distance Intersection over Union), which yields a significant accuracy improvement without degrading performance. Our experiments demonstrate that LSOD-YOLOv8 enhances detection accuracy specifically for cigarette detection.
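The exact WIMIoU combination is not specified in the abstract. As a hedged illustration only, the sketch below computes plain IoU with an MPDIoU-style corner-distance penalty (one published formulation normalizes squared corner distances by the image diagonal); treat the formula details as assumptions, not the paper's loss:

```python
import torch

def mpdiou_style_loss(pred, target, img_w, img_h):
    """Illustrative IoU loss with corner-distance penalties (MPDIoU-style).

    pred, target: (..., 4) boxes as (x1, y1, x2, y2).
    """
    x1 = torch.max(pred[..., 0], target[..., 0])
    y1 = torch.max(pred[..., 1], target[..., 1])
    x2 = torch.min(pred[..., 2], target[..., 2])
    y2 = torch.min(pred[..., 3], target[..., 3])
    inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)
    area_p = (pred[..., 2] - pred[..., 0]) * (pred[..., 3] - pred[..., 1])
    area_t = (target[..., 2] - target[..., 0]) * (target[..., 3] - target[..., 1])
    iou = inter / (area_p + area_t - inter + 1e-9)
    # Penalize distances between matching corners, normalized by image diagonal.
    diag2 = img_w ** 2 + img_h ** 2
    d_tl = (pred[..., 0] - target[..., 0]) ** 2 + (pred[..., 1] - target[..., 1]) ** 2
    d_br = (pred[..., 2] - target[..., 2]) ** 2 + (pred[..., 3] - target[..., 3]) ** 2
    return 1 - (iou - d_tl / diag2 - d_br / diag2)
```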

23 pages, 2392 KiB  
Article
MDFusion: Multi-Dimension Semantic–Spatial Feature Fusion for LiDAR–Camera 3D Object Detection
by Renzhong Qiao, Hao Yuan, Zhenbo Guan and Wenbo Zhang
Remote Sens. 2025, 17(7), 1240; https://doi.org/10.3390/rs17071240 - 31 Mar 2025
Viewed by 522
Abstract
Accurate 3D object detection is becoming increasingly vital for robust perception systems, particularly in applications such as autonomous driving and robotics. Many existing approaches rely on bird's eye view (BEV) feature maps to facilitate multi-modal interaction, as BEV representations enable efficient operations. However, the inherent sparsity of LiDAR BEV features often leads to misalignment with the dense semantic information in camera images, resulting in suboptimal fusion quality and degraded detection performance, especially in complex and dynamic environments. To mitigate these issues, this paper proposes a multi-dimension semantic–spatial feature fusion (MDFusion) method that combines LiDAR and image features in both 2D and 3D space. Specifically, image semantic features are extracted with the DeepLabV3 segmentation network, which captures rich contextual information, and are aligned with LiDAR point cloud voxel features through a summation operation to achieve precise semantic fusion. Additionally, LiDAR BEV features are fused with downsampled image features in 2D space via concatenation and spatially adaptive dilated convolution; this mechanism adapts dynamically to the spatial characteristics of the data, ensuring robust feature integration. Extensive experiments on the KITTI and ONCE datasets demonstrate that our method achieves competitive performance in complex scenes, significantly improving multi-modal fusion quality and detection accuracy while maintaining computational efficiency.
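A minimal sketch of the 2D branch described above (concatenating LiDAR BEV features with downsampled image features, then applying a dilated convolution). Channel sizes are assumptions, and the "spatially adaptive" weighting of MDFusion is not reproduced:

```python
import torch
import torch.nn as nn

class BevImageFusion2D(nn.Module):
    """Concatenate BEV and image features, then mix with a dilated convolution."""
    def __init__(self, c_bev=64, c_img=64, c_out=128):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Conv2d(c_bev + c_img, c_out, 3, padding=2, dilation=2, bias=False),
            nn.BatchNorm2d(c_out),
            nn.ReLU(inplace=True),
        )

    def forward(self, bev_feat, img_feat_down):
        # Both inputs are assumed to be aligned to the same BEV grid (H, W).
        return self.fuse(torch.cat([bev_feat, img_feat_down], dim=1))

fused = BevImageFusion2D()(torch.rand(1, 64, 200, 176), torch.rand(1, 64, 200, 176))
```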

19 pages, 5246 KiB  
Article
Application of 4PCS and KD-ICP Alignment Methods Based on ISS Feature Points for Rail Wear Detection
by Jie Shan, Hao Shi and Zhi Niu
Appl. Sci. 2025, 15(7), 3455; https://doi.org/10.3390/app15073455 - 21 Mar 2025
Viewed by 258
Abstract
To detect rail abrasion, a new point cloud alignment method is proposed that combines 4-points congruent sets (4PCS) coarse alignment based on intrinsic shape signature (ISS) feature points with K-dimensional iterative closest points (KD-ICP) fine alignment; for the first time, the combined algorithm is applied to rail wear detection. Because the 3D rail point cloud data collected by the 3D line laser sensor are voluminous, the raw data are first downsampled by voxel filtering. ISS feature points are then extracted from the processed point cloud for 4PCS coarse alignment and analyzed quantitatively, providing good initial conditions for fine alignment. A K-dimensional tree structure is used for the nearest-neighbor search to improve the alignment efficiency of the ICP algorithm, and total rail wear is finally calculated by combining the fine alignment results with the wear calculation formula. Experimental results show that, with 4496 extracted ISS feature points, the ISS-based 4PCS coarse alignment achieves higher alignment accuracy than the original 4PCS algorithm and the other compared algorithms, while the kd-tree-based ICP fine alignment consumes less time than the original ICP algorithm and the other compared algorithms. Overall, the proposed two-stage ISS-4PCS + KD-ICP method outperforms the original 4PCS + ICP pipeline in both alignment accuracy and runtime. Applying the combined algorithm to rail wear detection for the first time provides a reference for non-contact rail wear measurement, and its high accuracy and low time consumption lay a good foundation for subsequent rail wear calculation.
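A minimal Open3D sketch of a comparable coarse-to-fine pipeline follows. Open3D ships no 4PCS implementation, so RANSAC feature matching stands in for the coarse stage; Open3D's ICP already uses a kd-tree for correspondence search, which corresponds to the KD-ICP idea. Paths and parameters are placeholders:

```python
import open3d as o3d

def register_scans(src_path, tgt_path, voxel=0.005):
    # Voxel filtering, as in the paper, to thin the raw scans.
    src = o3d.io.read_point_cloud(src_path).voxel_down_sample(voxel)
    tgt = o3d.io.read_point_cloud(tgt_path).voxel_down_sample(voxel)

    # ISS keypoints restrict coarse alignment to salient geometry.
    src_kp = o3d.geometry.keypoint.compute_iss_keypoints(src)
    tgt_kp = o3d.geometry.keypoint.compute_iss_keypoints(tgt)
    for pc in (src_kp, tgt_kp):
        pc.estimate_normals(o3d.geometry.KDTreeSearchParamHybrid(radius=5 * voxel, max_nn=30))

    fpfh = lambda pc: o3d.pipelines.registration.compute_fpfh_feature(
        pc, o3d.geometry.KDTreeSearchParamHybrid(radius=10 * voxel, max_nn=100))
    # RANSAC feature matching as a stand-in for 4PCS coarse alignment.
    coarse = o3d.pipelines.registration.registration_ransac_based_on_feature_matching(
        src_kp, tgt_kp, fpfh(src_kp), fpfh(tgt_kp), True, 3 * voxel)

    # Fine stage: ICP, whose correspondence search is kd-tree based.
    fine = o3d.pipelines.registration.registration_icp(
        src, tgt, 2 * voxel, coarse.transformation,
        o3d.pipelines.registration.TransformationEstimationPointToPoint())
    return fine.transformation
```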

25 pages, 9187 KiB  
Article
Digital Reconstruction Method for Low-Illumination Road Traffic Accident Scenes Using UAV and Auxiliary Equipment
by Xinyi Zhang, Zhiwei Guan, Xiaofeng Liu and Zejiang Zhang
World Electr. Veh. J. 2025, 16(3), 171; https://doi.org/10.3390/wevj16030171 - 14 Mar 2025
Viewed by 415
Abstract
In low-illumination environments, traditional traffic accident survey methods struggle to obtain high-quality data. This paper proposes a traffic accident reconstruction method utilizing an unmanned aerial vehicle (UAV) and auxiliary equipment. First, a methodological framework for investigating traffic accidents under low-illumination conditions is developed. Accidents are classified based on the presence of obstructions, and corresponding investigation strategies are formulated. For unobstructed scenes, a UAV-mounted LiDAR scans the accident site to generate a comprehensive point cloud model; in partially obstructed scenes, a ground-based mobile laser scanner complements areas that are obscured or inaccessible to the UAV-mounted LiDAR. Subsequently, the collected point cloud data are down-sampled with a multiscale voxel iteration method to determine optimal parameters. The improved normal distributions transform (NDT) algorithm and different filtering algorithms are then adopted to register the ground and air point clouds, and the optimal combination of algorithms is selected to reconstruct a high-precision 3D point cloud model of the accident scene. Finally, experiments on two nighttime traffic accident scenarios are conducted, using the DJI Zenmuse L1 UAV LiDAR system and the EinScan Pro 2X mobile scanner for survey and reconstruction. In both experiments, the proposed method achieved RMSE values of 0.0427 m and 0.0451 m, outperforming traditional aerial photogrammetry-based modeling with RMSE values of 0.0466 m and 0.0581 m. The results demonstrate that this method can efficiently and accurately investigate low-illumination traffic accident scenes without being affected by obstructions, providing valuable technical support for refined traffic management and accident analysis. Challenges and future research directions are also discussed.
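The paper's multiscale voxel iteration searches for an optimal down-sampling parameter; the Open3D sketch below shows one simple interpretation, iterating over voxel sizes until the cloud falls under a target point budget. The stopping rule is an assumption, not the authors' criterion:

```python
import open3d as o3d

def iterative_voxel_downsample(pcd, target_points, sizes=(0.02, 0.05, 0.1, 0.2)):
    """Try increasingly coarse voxel sizes; return the first result that
    meets the point budget, plus the size used (illustrative criterion)."""
    for size in sizes:
        down = pcd.voxel_down_sample(size)
        if len(down.points) <= target_points:
            return down, size
    return down, sizes[-1]  # fall back to the coarsest size tried
```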

16 pages, 14380 KiB  
Article
Online Calibration Method of LiDAR and Camera Based on Fusion of Multi-Scale Cost Volume
by Xiaobo Han, Jie Luo, Xiaoxu Wei and Yongsheng Wang
Information 2025, 16(3), 223; https://doi.org/10.3390/info16030223 - 13 Mar 2025
Viewed by 637
Abstract
Online calibration of camera and LiDAR helps solve the problem of multi-sensor fusion and is of great significance in autonomous driving perception. Existing online calibration algorithms struggle to balance real-time performance and accuracy: high-precision algorithms impose heavy hardware requirements, while lightweight algorithms rarely meet accuracy requirements. Moreover, sensor noise, vibration, and changes in environmental conditions may reduce calibration accuracy, and because of large domain gaps between public datasets, existing online calibration algorithms are unstable across datasets and lack robustness. To solve these problems, we propose an online calibration algorithm based on multi-scale cost volume fusion. First, a multi-layer convolutional network downsamples and concatenates the camera RGB data and LiDAR point cloud data to obtain feature maps at three scales. These are then subjected to feature concatenation and group-wise correlation to generate three sets of cost volumes of different scales. All the cost volumes are then spliced and sent to the pose estimation module; after post-processing, the translation and rotation matrix between the camera and LiDAR coordinate systems is obtained. We tested and verified this method on the KITTI odometry dataset, measuring an average translation error of 0.278 cm, an average rotation error of 0.020°, and a runtime of 23 ms per frame, which is competitive with state-of-the-art methods.
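Group-wise correlation, mentioned above, splits channels into groups and correlates the two feature maps per group; the sketch below is a generic PyTorch rendering of that operation, not the paper's exact module:

```python
import torch

def groupwise_correlation(feat_cam, feat_lidar, num_groups=8):
    """Per-group dot-product correlation between two aligned feature maps.

    feat_cam, feat_lidar: (B, C, H, W) with C divisible by num_groups.
    Returns a cost volume of shape (B, num_groups, H, W).
    """
    b, c, h, w = feat_cam.shape
    assert c % num_groups == 0
    cam = feat_cam.view(b, num_groups, c // num_groups, h, w)
    lid = feat_lidar.view(b, num_groups, c // num_groups, h, w)
    return (cam * lid).mean(dim=2)

cost = groupwise_correlation(torch.rand(2, 64, 48, 160), torch.rand(2, 64, 48, 160))
```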

20 pages, 3968 KiB  
Article
Research on Multi-Scale Point Cloud Completion Method Based on Local Neighborhood Dynamic Fusion
by Yalun Liu, Jiantao Sun and Ling Zhao
Appl. Sci. 2025, 15(6), 3006; https://doi.org/10.3390/app15063006 - 10 Mar 2025
Viewed by 629
Abstract
Point cloud completion reconstructs incomplete, sparse inputs into complete 3D shapes. However, current 3D completion methods struggle to effectively extract the local details of an incomplete point cloud, resulting in poor restoration of local details and low accuracy of the completed point clouds. To address this problem, this paper proposes a multi-scale point cloud completion method based on local neighborhood dynamic fusion (LNDF: adaptive aggregation of multi-scale local features through dynamic range and weight adjustment). First, the farthest point sampling (FPS) strategy is applied to the original incomplete point clouds to obtain down-sampled point clouds at three different scales. When extracting features from point clouds of different scales, the local neighborhood aggregation of key points is dynamically adjusted, and a Transformer architecture is integrated to further strengthen the correlation of the extracted local features. Second, by generating point clouds layer by layer in a pyramid-like manner, the local details of the point clouds are gradually enriched from coarse to fine to achieve completion. Finally, inspired by generative adversarial networks (GANs), the decoder adds an attention discriminator, composed of a feature extraction layer in series with an attention layer, to further optimize the completion performance of the network. Experimental results show that LNDM-Net reduces the average Chamfer Distance (CD) by 5.78% on PCN and 4.54% on ShapeNet compared to state-of-the-art methods. Visualization of the completion results demonstrates the superior performance of our method in both completion accuracy and local detail preservation, and the approach exhibits enhanced generalization capability and completion fidelity when handling diverse, incomplete point clouds from real-world 3D scenes in the KITTI dataset.
(This article belongs to the Special Issue Advanced Pattern Recognition & Computer Vision)
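Farthest point sampling (FPS), the first step above, is a standard algorithm; a minimal NumPy version follows for reference (not the paper's implementation):

```python
import numpy as np

def farthest_point_sampling(points, k, seed=0):
    """Greedy FPS: repeatedly pick the point farthest from those chosen so far."""
    rng = np.random.default_rng(seed)
    n = points.shape[0]
    chosen = np.empty(k, dtype=np.int64)
    chosen[0] = rng.integers(n)
    min_dist = np.linalg.norm(points - points[chosen[0]], axis=1)
    for i in range(1, k):
        chosen[i] = np.argmax(min_dist)  # farthest from the current set
        d = np.linalg.norm(points - points[chosen[i]], axis=1)
        min_dist = np.minimum(min_dist, d)
    return points[chosen]

coarse = farthest_point_sampling(np.random.rand(2048, 3), 512)
```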

19 pages, 7587 KiB  
Article
GPC-YOLO: An Improved Lightweight YOLOv8n Network for the Detection of Tomato Maturity in Unstructured Natural Environments
by Yaolin Dong, Jinwei Qiao, Na Liu, Yunze He, Shuzan Li, Xucai Hu, Chengyan Yu and Chengyu Zhang
Sensors 2025, 25(5), 1502; https://doi.org/10.3390/s25051502 - 28 Feb 2025
Viewed by 752
Abstract
Effective fruit identification and maturity detection are important for harvesting and managing tomatoes. Current deep learning detection algorithms typically demand significant computational resources and memory, and detecting severely stacked and obscured tomatoes in unstructured natural environments is challenging because of target stacking, target occlusion, natural illumination, and background noise. We propose a new lightweight model, GPC-YOLO, based on YOLOv8n for tomato identification and maturity detection. A C2f-PC module based on partial convolution (PConv) replaces the original C2f feature extraction module of YOLOv8n to reduce computation, and the regular convolutions used for downsampling are replaced with the lightweight grouped spatial convolution (GSConv) to reduce the computational burden. The neck network is replaced with the convolutional neural network-based cross-scale feature fusion (CCFF) module to enhance the model's adaptability to scale changes and its detection of small objects. Additionally, the simple attention mechanism (SimAM) and efficient intersection over union (EIoU) loss are integrated to further enhance detection accuracy while preserving these lightweight improvements. GPC-YOLO was trained and validated on a dataset of 1249 mobile phone images of tomatoes. Compared to the original YOLOv8n, GPC-YOLO reduces the parameter count to 1.2 M (by 59.9%), compresses the model size to 2.7 M (by 57.1%), decreases the floating-point operations to 4.5 G (by 45.1%), and improves accuracy to 98.7% (by 0.3%), with a detection speed of 201 FPS. This study shows that GPC-YOLO can effectively identify tomato fruit and detect fruit maturity in unstructured natural environments, with immense potential for tomato ripeness detection and automated picking applications.
(This article belongs to the Section Intelligent Sensors)
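GSConv, used above for downsampling, mixes a standard convolution with a cheap depthwise branch. The module below loosely follows published GSConv descriptions and is a sketch, not the GPC-YOLO code; kernel sizes and the shuffle step are assumptions:

```python
import torch
import torch.nn as nn

class GSConv(nn.Module):
    """Half standard conv, half depthwise conv on its output, concatenated
    and channel-shuffled (sketch loosely following GSConv descriptions)."""
    def __init__(self, c_in, c_out, k=3, s=2):
        super().__init__()
        c_half = c_out // 2
        self.primary = nn.Sequential(
            nn.Conv2d(c_in, c_half, k, s, k // 2, bias=False),
            nn.BatchNorm2d(c_half), nn.SiLU())
        self.cheap = nn.Sequential(
            nn.Conv2d(c_half, c_half, 5, 1, 2, groups=c_half, bias=False),
            nn.BatchNorm2d(c_half), nn.SiLU())

    def forward(self, x):
        y1 = self.primary(x)
        y = torch.cat([y1, self.cheap(y1)], dim=1)
        b, c, h, w = y.shape  # channel shuffle to mix the two branches
        return y.view(b, 2, c // 2, h, w).transpose(1, 2).reshape(b, c, h, w)

out = GSConv(32, 64)(torch.rand(1, 32, 128, 128))  # -> (1, 64, 64, 64)
```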

25 pages, 34424 KiB  
Article
Resampling Point Clouds Using Series of Local Triangulations
by Vijai Kumar Suriyababu, Cornelis Vuik and Matthias Möller
J. Imaging 2025, 11(2), 49; https://doi.org/10.3390/jimaging11020049 - 8 Feb 2025
Viewed by 960
Abstract
The increasing reliance on 3D scanning and meshless methods highlights the need for algorithms optimized for point-cloud geometry representations in CAE simulations. While voxel-based binning methods are simple, they often compromise geometry and topology, particularly with coarse voxelizations. We propose an algorithm based on a Series of Local Triangulations (SOLT) as an intermediate representation for point clouds, enabling efficient upsampling and downsampling. This robust and straightforward approach preserves the integrity of point clouds, ensuring resampling without feature loss or topological distortions. The proposed techniques integrate seamlessly into existing engineering workflows, avoiding complex optimization or machine learning methods while delivering reliable, high-quality results across a large number of examples. Resampled point clouds produced by our method can be used directly for solving PDEs or as input for surface reconstruction algorithms. We demonstrate the effectiveness of this approach on mechanically sampled point clouds and real-world 3D scans.
(This article belongs to the Special Issue Exploring Challenges and Innovations in 3D Point Cloud Processing)
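The SOLT construction itself is not reproduced here; the sketch below only illustrates the underlying idea (local triangulations as a resampling intermediate) by projecting each point's neighborhood onto a best-fit plane, triangulating in 2D, and emitting triangle centroids as new points. Neighborhood size and the centroid rule are assumptions:

```python
import numpy as np
from scipy.spatial import Delaunay, cKDTree

def upsample_local_triangulation(points, n_neighbors=12):
    """Insert the centroid of every triangle of each point's local
    triangulation (illustrative; not the paper's full SOLT pipeline)."""
    tree = cKDTree(points)
    new_pts = []
    for p in points:
        _, idx = tree.query(p, k=n_neighbors)
        nbrs = points[idx]
        centered = nbrs - nbrs.mean(axis=0)
        # Project onto the neighborhood's best-fit plane (top-2 SVD axes).
        _, _, vt = np.linalg.svd(centered, full_matrices=False)
        uv = centered @ vt[:2].T
        try:
            tri = Delaunay(uv)
        except Exception:  # degenerate (e.g., collinear) neighborhoods
            continue
        new_pts.extend(nbrs[s].mean(axis=0) for s in tri.simplices)
    return np.vstack([points, np.asarray(new_pts)]) if new_pts else points
```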

20 pages, 9475 KiB  
Article
Cross-Domain Generalization for LiDAR-Based 3D Object Detection in Infrastructure and Vehicle Environments
by Peng Zhi, Longhao Jiang, Xiao Yang, Xingzheng Wang, Hung-Wei Li, Qingguo Zhou, Kuan-Ching Li and Mirjana Ivanović
Sensors 2025, 25(3), 767; https://doi.org/10.3390/s25030767 - 27 Jan 2025
Viewed by 1045
Abstract
In intelligent transportation, the Internet of Things (IoT) commonly applies 3D object detection as a crucial part of Vehicle-to-Everything (V2X) cooperative perception. However, discrepancies in sensor configurations between vehicles and infrastructure lead to variations in the scale and heterogeneity of point clouds. To address the performance gap caused by poor generalization of 3D object detection models across heterogeneous LiDAR point clouds, we propose the Dual-Channel Generalization Neural Network (DCGNN), which incorporates a novel data-level downsampling and calibration module along with a cross-perspective Squeeze-and-Excitation attention mechanism for improved feature fusion. Experimental results on the DAIR-V2X dataset indicate that DCGNN outperforms detectors trained on single datasets, demonstrating significant improvements over the selected baseline models.
(This article belongs to the Special Issue Connected Vehicles and Vehicular Sensing in Smart Cities)
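DCGNN's cross-perspective variant of Squeeze-and-Excitation is not public here; for orientation, the block below is the standard SE recipe it builds on, written as a hedged sketch:

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Standard squeeze-and-excitation: global-pool, bottleneck MLP, rescale."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):                 # x: (B, C, H, W)
        w = x.mean(dim=(2, 3))            # squeeze: per-channel global average
        w = self.fc(w)[:, :, None, None]  # excite: learned channel weights
        return x * w                      # rescale channels

y = SEBlock(64)(torch.rand(2, 64, 32, 32))
```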

21 pages, 1358 KiB  
Article
A 3D Face Recognition Algorithm Directly Applied to Point Clouds
by Xingyi You and Xiaohu Zhao
Biomimetics 2025, 10(2), 70; https://doi.org/10.3390/biomimetics10020070 - 23 Jan 2025
Viewed by 1113
Abstract
Face recognition technology, despite its widespread use, still faces challenges related to occlusions, pose variations, and expression changes. Three-dimensional face recognition using depth information, particularly with point cloud-based networks, has proven effective in overcoming these challenges. However, due to the limited availability of large-scale 3D facial data and the non-rigid nature of facial structures, extracting distinctive facial representations directly from point clouds remains difficult. To address this, we propose two key approaches. First, we introduce a learning framework guided by a small amount of real face data, based on morphable models with Gaussian processes; it uses a novel method for generating large-scale virtual face scans, addressing the scarcity of 3D data. Second, we present a dual-branch network, built on kernel point convolution (KPConv), that extracts non-rigid facial features directly from point clouds. A local neighborhood adaptive feature learning module employs context sampling, hierarchically downsampling the feature-sensitive points critical for deep transfer and aggregation, to enhance the extraction of discriminative facial features. Notably, our training strategy combines large-scale virtual face scans with 967 real face scans from the FRGC v2.0 subset, demonstrating the effectiveness of guidance from a small amount of real data. Experiments on the FRGC v2.0 and Bosphorus datasets demonstrate the effectiveness and potential of our method.
(This article belongs to the Special Issue Exploration of Bioinspired Computer Vision and Pattern Recognition)

18 pages, 3867 KiB  
Article
Aluminum Electrolysis Fire-Eye Image Segmentation Based on the Improved U-Net Under Carbon Slag Interference
by Xuan Shi, Xiaofang Chen, Lihui Cen, Yongfang Xie and Zeyang Yin
Electronics 2025, 14(2), 336; https://doi.org/10.3390/electronics14020336 - 16 Jan 2025
Viewed by 601
Abstract
To address the low segmentation accuracy caused by the complex shapes of carbon slag in aluminum electrolysis fire-eye images and the blurred boundary between the slag and the surrounding electrolyte, this paper proposes a fire-eye image segmentation model based on an improved U-Net. First, the model reduces the depth of the traditional U-Net to four layers and uses a multiscale dilated convolution module (MDCM) in the down-sampling stage. Second, the Convolutional Block Attention Module (CBAM) is embedded in the skip connections to improve the model's ability to extract contextual features at multiple scales, strengthen the guidance of high-level features to low-level features, and direct attention to the critical regions. To alleviate the negative impact of the imbalance between positive and negative examples in the dataset, the weighted binary cross-entropy loss and the Dice loss replace the traditional cross-entropy loss. Experimental results show that the segmentation accuracy of the improved model on the fire-eye dataset reaches 88.03%, 5.61 percentage points higher than U-Net.
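The combined loss named above can be written compactly; the sketch below is one standard formulation, with the weights as placeholders rather than the paper's values:

```python
import torch
import torch.nn.functional as F

def wbce_dice_loss(logits, targets, pos_weight=2.0, dice_weight=1.0, eps=1e-6):
    """Weighted binary cross-entropy plus Dice loss for binary segmentation.

    logits, targets: (B, 1, H, W); pos_weight up-weights the (rarer) slag
    pixels. Both weights are illustrative placeholders.
    """
    bce = F.binary_cross_entropy_with_logits(
        logits, targets,
        pos_weight=torch.as_tensor(pos_weight, device=logits.device))
    probs = torch.sigmoid(logits)
    inter = (probs * targets).sum()
    dice = 1.0 - (2.0 * inter + eps) / (probs.sum() + targets.sum() + eps)
    return bce + dice_weight * dice
```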

22 pages, 18757 KiB  
Article
CSGD-YOLO: A Corn Seed Germination Status Detection Model Based on YOLOv8n
by Wenbin Sun, Meihan Xu, Kang Xu, Dongquan Chen, Jianhua Wang, Ranbing Yang, Quanquan Chen and Songmei Yang
Agronomy 2025, 15(1), 128; https://doi.org/10.3390/agronomy15010128 - 7 Jan 2025
Cited by 5 | Viewed by 787
Abstract
Seed quality testing is crucial for ensuring food security and stability. To accurately detect the germination status of corn seeds during the paper-medium germination test, this study proposes a corn seed germination status detection model based on YOLOv8n (CSGD-YOLO). First, to reduce the complexity of conventional models, a lightweight spatial pyramid pooling fast (L-SPPF) structure is engineered to enhance feature representation. A detection module dubbed Ghost_Detection, leveraging the GhostConv architecture, is devised to boost detection efficiency while reducing parameter counts and computational overhead. Additionally, during downsampling in the backbone network, a downsampling module based on receptive field attention convolution (RFAConv) is designed to sharpen the model's focus on areas of interest. The study further proposes a module named C2f-UIB-iAFF, based on the faster implementation of cross-stage partial bottleneck with two convolutions (C2f), the universal inverted bottleneck (UIB), and iterative attention feature fusion (iAFF), to replace the original C2f in YOLOv8, streamlining model complexity and augmenting the feature fusion capability of the residual structure. Experiments on the collected corn seed germination dataset show that CSGD-YOLO requires only 1.91 M parameters and 5.21 G floating-point operations (FLOPs), achieving a detection precision (P), recall (R), mAP0.5, and mAP0.50:0.95 of 89.44%, 88.82%, 92.99%, and 80.38%, respectively. Compared with YOLOv8n, CSGD-YOLO improves performance in terms of accuracy, model size, parameter count, and floating-point operations by 1.39, 1.43, 1.77, and 2.95 percentage points, respectively. CSGD-YOLO thus outperforms existing mainstream detection models in both detection performance and model complexity, making it suitable for detecting corn seed germination status and providing a reference for rapid germination-rate testing.
(This article belongs to the Section Precision and Digital Agriculture)

28 pages, 7288 KiB  
Article
Geometric Feature Characterization of Apple Trees from 3D LiDAR Point Cloud Data
by Md Rejaul Karim, Shahriar Ahmed, Md Nasim Reza, Kyu-Ho Lee, Joonjea Sung and Sun-Ok Chung
J. Imaging 2025, 11(1), 5; https://doi.org/10.3390/jimaging11010005 - 31 Dec 2024
Viewed by 1359
Abstract
The geometric feature characterization of fruit trees plays a key role in effective orchard management. LiDAR (light detection and ranging) technology enables rapid and precise evaluation of geometric features. This study aimed to quantify tree height, canopy volume, tree spacing, and row spacing in an apple orchard using a three-dimensional (3D) LiDAR sensor. A LiDAR sensor was used to collect 3D point cloud data from the orchard, and six apple trees, representing a variety of shapes and sizes, were selected for data collection and validation. Commercial software and the Python programming language were used to process the collected data. Processing involved data conversion, radius outlier removal, voxel-grid downsampling, denoising through filtering and removal of erroneous points, segmentation of the region of interest (ROI), clustering using the DBSCAN (density-based spatial clustering of applications with noise) algorithm, data transformation, and removal of ground points. Accuracy was assessed by comparing estimates from the point cloud with the corresponding measured values. The sensor-estimated and measured tree heights were 3.05 ± 0.34 m and 3.13 ± 0.33 m, respectively, with a mean absolute error (MAE) of 0.08 m, a root mean squared error (RMSE) of 0.09 m, a linear coefficient of determination (r²) of 0.98, a confidence interval (CI) of −0.14 to −0.02 m, and a high concordance correlation coefficient (CCC) of 0.96, indicating strong agreement and high accuracy. The sensor-estimated and measured canopy volumes were 13.76 ± 2.46 m³ and 14.09 ± 2.10 m³, respectively, with an MAE of 0.57 m³, an RMSE of 0.61 m³, an r² of 0.97, and a CI of −0.92 to 0.26, demonstrating high precision. For tree and row spacing, the sensor-estimated and measured distances were 3.04 ± 0.17 m versus 3.18 ± 0.24 m, and 3.35 ± 0.08 m versus 3.40 ± 0.05 m, respectively, with RMSE and r² values of 0.12 m and 0.92 for tree spacing and 0.07 m and 0.94 for row spacing. The MAE and CI values were 0.09 m, 0.05 m, and −0.18 for tree spacing and 0.01, −0.1, and 0.002 for row spacing, respectively. Although minor differences were observed, the sensor estimates were efficient, though specific measurements require further refinement. These results are based on a limited dataset of six trees, providing initial insights into geometric feature characterization; the small sample size limits the generalizability of the findings and calls for caution in interpretation. Future studies should incorporate a broader and more diverse dataset to validate and refine the characterization, enhancing management practices in apple orchards.
(This article belongs to the Special Issue Exploring Challenges and Innovations in 3D Point Cloud Processing)
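Several of the preprocessing steps listed above map directly onto Open3D calls; the fragment below is a hedged sketch of that portion of the pipeline, with a placeholder path and illustrative parameters rather than the authors' settings:

```python
import numpy as np
import open3d as o3d

pcd = o3d.io.read_point_cloud("orchard_scan.pcd")  # placeholder path

# Radius outlier removal, then voxel-grid downsampling, as in the paper.
pcd, _ = pcd.remove_radius_outlier(nb_points=16, radius=0.05)
pcd = pcd.voxel_down_sample(voxel_size=0.02)

# DBSCAN clustering to separate individual trees (parameters illustrative);
# label -1 marks noise points, which are simply skipped here.
labels = np.asarray(pcd.cluster_dbscan(eps=0.3, min_points=20))
trees = [pcd.select_by_index(np.where(labels == i)[0])
         for i in range(labels.max() + 1)]
```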

32 pages, 46700 KiB  
Article
Material Visual Perception and Discharging Robot Control for Baijiu Fermented Grains in Underground Tank
by Yan Zhao, Zhongxun Wang, Hui Li, Chang Wang, Jianhua Zhang, Jingyuan Zhu and Xuan Liu
Sensors 2024, 24(24), 8215; https://doi.org/10.3390/s24248215 - 23 Dec 2024
Viewed by 873
Abstract
To address the excessive manual intervention required to discharge fermented grains from underground tanks in traditional brewing, this paper proposes an intelligent grain-discharging strategy based on a multi-degree-of-freedom hybrid robot. The robot's structure and control system are introduced, along with kinematic analyses of its parallel components and end-effector speeds. Based on its structural characteristics and working conditions, a visual-perception-based motion control method for discharging fermented grains is developed. Perception of the underground tanks' positions is enhanced through an improved Canny edge detection algorithm, and a YOLOv7 neural network is used to train an image segmentation model for the fermented grain surface, integrating depth information to synthesize point clouds. These point clouds are downsampled and reconstructed in three dimensions, and the underground tank model is matched with the fermented grain surface model to replicate the tank's interior space. Finally, a digging motion control method is proposed and experimentally validated for feasibility and operational efficiency.
(This article belongs to the Section Sensors and Robotics)
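The paper's improved Canny variant is not detailed in the abstract; the baseline OpenCV Canny pass below shows the starting point it builds on, with the file name and thresholds as placeholders:

```python
import cv2

img = cv2.imread("tank_mouth.png", cv2.IMREAD_GRAYSCALE)  # placeholder path
blurred = cv2.GaussianBlur(img, (5, 5), 1.5)  # suppress sensor noise first
edges = cv2.Canny(blurred, threshold1=50, threshold2=150)
cv2.imwrite("tank_mouth_edges.png", edges)
```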
