A Neural-Network-Based Cost-Effective Method for Initial Weld Point Extraction from 2D Images
Abstract
1. Introduction
1.1. Background
1.2. Related Work
1.3. Contribution
2. Materials and Methods
2.1. Framework
2.2. Two-Dimensional Feature Extraction
2.2.1. Object Detection
- Increased inference speed: optimized architecture and model quantization techniques in YOLOv8 contribute to significantly faster inference, making it well-suited for real-time applications on edge devices with limited processing power;
- Reduced model size: YOLOv8’s efficient design results in smaller model sizes compared with YOLOv5, requiring less memory and storage, which are critical factors for deployment on edge devices.
- Horizontal flips: mirroring the images horizontally to increase the variety of vertex orientations;
- Rotations: rotating images by ±15° to account for potential variations in camera orientation;
- Exposure adjustments: modifying the exposure by +10% and −10% to introduce robustness to different lighting conditions;
- Blurring: applying a Gaussian blur with a radius of up to 2.5 pixels to simulate slight defocusing or motion blur.
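As a rough illustration of the first and third augmentations, the sketch below flips a small grayscale image together with its bounding-box label and rescales the exposure. It is not the authors' augmentation pipeline: the function names are hypothetical, the image is a plain list of rows, the label is assumed to be in the normalized YOLO format (x_center, y_center, width, height), and exposure is modelled as a simple multiplicative factor. Rotation and blurring are omitted for brevity.

```python
import random


def hflip_image(img):
    """Mirror a row-major image (a list of pixel rows) left to right."""
    return [row[::-1] for row in img]


def hflip_box(box):
    """Mirror a normalized YOLO box (x_center, y_center, width, height)."""
    xc, yc, w, h = box
    return (1.0 - xc, yc, w, h)


def adjust_exposure(img, factor):
    """Scale 8-bit intensities by `factor` and clamp the result to [0, 255]."""
    return [[min(255, max(0, round(p * factor))) for p in row] for row in img]


def augment(img, box, rng):
    """Return one randomly augmented (image, box) pair."""
    if rng.random() < 0.5:  # flip half of the samples
        img, box = hflip_image(img), hflip_box(box)
    factor = rng.uniform(0.9, 1.1)  # assumed exposure range of -10% to +10%
    return adjust_exposure(img, factor), box


image = [[0, 100, 200], [10, 20, 250]]
label = (0.25, 0.4, 0.1, 0.2)
aug_image, aug_label = augment(image, label, random.Random(0))
```

The important detail is that a geometric augmentation must be applied to the label as well as to the pixels, whereas a photometric one such as exposure leaves the label unchanged.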
- Image resolution: The input images were resized to 640 × 640 pixels, which is a commonly used resolution for YOLO models. This ensured a consistent input size for the model.
- Epochs: the model was trained for 75 epochs, a relatively small number chosen to mitigate overfitting given the limited dataset size.
- Batch size: A batch size of 16 was used during training, meaning that the model’s weights were updated after each batch of 16 images. This value was chosen based on the available computing resources. A larger batch size can speed up training, but it requires more memory.
- Learning rate: The learning rate was set to 0.01. This parameter controls how much the model’s weights are adjusted during each training iteration. A higher learning rate can lead to faster convergence, but it might cause oscillations or overshoot the optimal weights.
- Optimizer: The stochastic gradient descent (SGD) optimizer was employed. SGD is a classic optimization algorithm that iteratively updates the model’s weights based on the gradient of the loss function.
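To make the roles of the batch size, learning rate, and optimizer concrete, the following toy sketch applies plain mini-batch SGD to a one-parameter model using the values listed above. It is only an illustration: the model, data, and function names are assumptions, the update has no momentum or weight decay (which a real training framework may add), and it does not reproduce the YOLOv8 training itself.

```python
# Hyperparameters as reported in the text (key names are illustrative).
HYPERPARAMS = {"image_size": 640, "epochs": 75, "batch_size": 16, "learning_rate": 0.01}


def sgd_step(weights, grads, lr):
    """One plain SGD update: w <- w - lr * dL/dw (no momentum, no weight decay)."""
    return [w - lr * g for w, g in zip(weights, grads)]


def batch_gradient(w, batch):
    """Gradient of the mean loss 0.5 * (w * x - y)^2 over one mini-batch."""
    return sum((w * x - y) * x for x, y in batch) / len(batch)


def train(samples, params=HYPERPARAMS):
    """Fit the toy model y = w * x and count how many weight updates were made."""
    w, updates = 0.0, 0
    size = params["batch_size"]
    for _ in range(params["epochs"]):
        for start in range(0, len(samples), size):
            batch = samples[start:start + size]
            grad = batch_gradient(w, batch)
            (w,) = sgd_step([w], [grad], params["learning_rate"])
            updates += 1  # one update per mini-batch, not per image
    return w, updates


# Toy data: 32 samples of y = 3x, so each epoch contains two batches of 16.
SAMPLES = [(i / 16, 3 * i / 16) for i in range(32)]
final_w, n_updates = train(SAMPLES)
```

With 32 samples and a batch size of 16, each epoch performs two updates, giving 150 updates over 75 epochs. In this toy setting the weight moves from 0 toward the true value of 3 without fully reaching it, which illustrates the trade-off between a conservative learning rate and a small epoch budget.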
2.2.2. Semantic Segmentation
2.2.3. Pixel Calculation
- Image 1: the first image in each set shows the original image with the bounding box from the object detection stage, highlighting the detected vertex;
- Image 2: the second image shows the cropped area around the detected vertex, which was further processed by the semantic segmentation model;
- Image 3: the third image shows the result of the semantic segmentation, where the pixels corresponding to the vertex are highlighted in red;
- Image 4: the fourth image shows the final pixel location calculated from the semantic segmentation, as represented by a small white circle in the center of the segmented area.
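The step from Image 3 to Image 4 can be sketched as follows. The text only states that the final pixel lies at the center of the segmented area, so this example assumes the center is the centroid of the mask pixels; the function names and the crop-offset convention are also illustrative assumptions rather than the authors' implementation.

```python
def mask_centroid(mask):
    """Return the (u, v) centroid of the nonzero pixels of a binary mask.

    `mask` is a list of rows; u is the column index and v the row index.
    Returns None when the mask is empty.
    """
    count = sum_u = sum_v = 0
    for v, row in enumerate(mask):
        for u, value in enumerate(row):
            if value:
                count += 1
                sum_u += u
                sum_v += v
    if count == 0:
        return None
    return (sum_u / count, sum_v / count)


def crop_to_image(point, crop_origin):
    """Map a point from crop coordinates back to full-image coordinates.

    `crop_origin` is the (u, v) position of the crop's top-left corner,
    i.e. the top-left corner of the detection bounding box.
    """
    return (point[0] + crop_origin[0], point[1] + crop_origin[1])


# A 3 x 4 crop whose segmented region is a 2 x 2 block of pixels.
crop_mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
]
center_in_crop = mask_centroid(crop_mask)
center_in_image = crop_to_image(center_in_crop, (100, 50))
```

Because segmentation runs on the cropped region, the offset of the bounding box has to be added back before the pixel can be used in the full image.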
2.3. Three-Dimensional Position of the Initial Weld Point
Reference Frames
3. Experiment Results and Discussion
3.1. Experimental Setup
3.2. Ground Truth Retrieval
- Downsampling: This reduces the density of the point cloud by grouping nearby points into 3D cubes (voxels), each with a side length of 2 mm. This process balances the accuracy and processing time by simplifying the point cloud while retaining relevant geometric information.
- RANSAC: The RANSAC (Random Sample Consensus) algorithm [27] was employed to extract planes from the point cloud. RANSAC works by iteratively selecting a minimal set of points (in this case, 500 points) to define a plane and then evaluating the consensus of the remaining points against this plane. This process was repeated for a maximum of 100 iterations. A point was considered an inlier if its distance from the fitted plane was less than 3 mm. This robust approach effectively identified the dominant planes within the point cloud, even in the presence of noise and outliers.
- Plane intersection: Potential U-shaped structures were identified by verifying the near-perpendicularity of three extracted planes. To ensure a robust detection, we required that the planes formed angles close to 90 degrees, with a tolerance of 3 degrees. Additionally, we checked that each point within the point cloud was located within a maximum distance of 3 cm from the intersection points of all three planes. This combined approach effectively identified potential U-shaped structures by verifying both the geometric relationship between the planes and the spatial arrangement of the points within the point cloud.
- Vertex extraction: the intersection point of these three planes becomes the candidate vertex.
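The geometric parts of this pipeline (downsampling, the perpendicularity check, and the vertex computation) can be sketched in plain Python as below. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, each voxel is represented by the mean of its points, and each plane is written as a pair (normal, offset) satisfying normal · x = offset, with coordinates in millimetres. The RANSAC plane fitting itself is omitted.

```python
import math
from collections import defaultdict


def voxel_downsample(points, voxel=2.0):
    """Replace all points that fall into the same cubic voxel by their mean."""
    cells = defaultdict(list)
    for p in points:
        key = tuple(math.floor(c / voxel) for c in p)
        cells[key].append(p)
    return [tuple(sum(c) / len(ps) for c in zip(*ps)) for ps in cells.values()]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def angle_deg(n1, n2):
    """Angle between two plane normals, folded into the range [0, 90] degrees."""
    c = abs(dot(n1, n2)) / (math.sqrt(dot(n1, n1)) * math.sqrt(dot(n2, n2)))
    return math.degrees(math.acos(min(1.0, c)))


def mutually_perpendicular(normals, tol_deg=3.0):
    """True if every pair of the three normals is within tol_deg of 90 degrees."""
    pairs = [(0, 1), (0, 2), (1, 2)]
    return all(abs(angle_deg(normals[i], normals[j]) - 90.0) <= tol_deg
               for i, j in pairs)


def intersect_three_planes(planes):
    """Return the common point of three planes, or None if they are degenerate."""
    (n1, d1), (n2, d2), (n3, d3) = planes
    det = dot(n1, cross(n2, n3))
    if abs(det) < 1e-9:
        return None
    # x = (d1 (n2 x n3) + d2 (n3 x n1) + d3 (n1 x n2)) / (n1 . (n2 x n3))
    a, b, c = cross(n2, n3), cross(n3, n1), cross(n1, n2)
    return tuple((d1 * a[i] + d2 * b[i] + d3 * c[i]) / det for i in range(3))


# Three axis-aligned planes: x = 10, y = 20, z = 5.
planes = [((1.0, 0.0, 0.0), 10.0), ((0.0, 1.0, 0.0), 20.0), ((0.0, 0.0, 1.0), 5.0)]
normals = [n for n, _ in planes]
vertex = intersect_three_planes(planes) if mutually_perpendicular(normals) else None
```

Checking the angles before intersecting the planes avoids computing a vertex from planes that are nearly parallel, where the intersection point is numerically unstable.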
3.3. Two-Dimensional Feature Extraction Error
3.4. Three-Dimensional Distance Error
3.5. Processing Time
3.6. Lighting Conditions
4. Conclusions
- Cost-effectiveness: our method offers a significant cost reduction, with hardware costs up to 50 times lower than those of traditional RGB-D cameras, making it an attractive solution for industrial settings with budget constraints.
- Versatility: the use of 2D cameras provides flexibility and adaptability for various welding scenarios, as they are easier to integrate into existing robotic systems compared with more complex 3D sensing solutions.
- Robustness: our two-stage YOLO pipeline was found to be effective in identifying keypoints, particularly in scenarios with low-contrast images, such as those produced by grayscale cameras and dark metal surfaces.
- Efficiency: Initial tests, which were conducted in simulation using ROS and structured robot data, demonstrated near real-time performance. Further optimization with GPU acceleration and reduced data saving could further improve the speed and efficiency.
- We developed a methodology for identifying keypoints from 3D point clouds.
- We presented a novel 2D image-based approach for detecting 3D keypoints, incorporating robot positioning information.
- We conducted a comparative analysis that demonstrated the potential benefits and limitations of both 2D and 3D approaches.
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
ROS | Robot operating system |
SAM | Segment Anything Model |
YOLO | You Only Look Once |
PCL | Point Cloud Library |
POI | Point of interest |
CUDA | Compute unified device architecture |
SGD | Stochastic gradient descent |
References
- Mandal, N.R. Ship Construction and Welding; Springer: Berlin/Heidelberg, Germany, 2017; Volume 2.
- Lee, D.; Ku, N.; Kim, T.W.; Kim, J.; Lee, K.Y.; Son, Y.S. Development and application of an intelligent welding robot system for shipbuilding. Robot. Comput. Integr. Manuf. 2011, 27, 377–388.
- Dinham, M.; Fang, G. Detection of fillet weld joints using an adaptive line growing algorithm for robotic arc welding. Robot. Comput. Integr. Manuf. 2014, 30, 229–243.
- Zhang, L.; Xu, Y.; Du, S.; Zhao, W.; Hou, Z.; Chen, S. Point Cloud Based Three-Dimensional Reconstruction and Identification of Initial Welding Position. In Transactions on Intelligent Welding Manufacturing; Springer: Singapore, 2018; pp. 61–77.
- Gao, J.; Li, F.; Zhang, C.; He, W.; He, J.; Chen, X. A Method of D-Type Weld Seam Extraction Based on Point Clouds. IEEE Access 2021, 9, 65401–65410.
- Kim, J.; Lee, J.; Chung, M.; Shin, Y.G. Multiple weld seam extraction from RGB-depth images for automatic robotic welding via point cloud registration. Multimed. Tools Appl. 2020, 80, 9703–9719.
- Yang, S.; Shi, X.; Tian, X.; Liu, Y. An Approach to the Extraction of Intersecting Pipes Weld Seam Based on 3D Point Cloud. In Proceedings of the 2022 IEEE 11th Data Driven Control and Learning Systems Conference (DDCLS), Chengdu, China, 3–5 August 2022.
- Patil, V.; Patil, I.; Kalaichelvi, V.; Karthikeyan, R. Extraction of Weld Seam in 3D Point Clouds for Real Time Welding Using 5 DOF Robotic Arm. In Proceedings of the 2019 5th International Conference on Control, Automation and Robotics (ICCAR), Beijing, China, 19–22 April 2019.
- Wei, S.c.; Wang, J.; Lin, T.; Chen, S.b. Application of image morphology in detecting and extracting the initial welding position. J. Shanghai Jiaotong Univ. Sci. 2012, 17, 323–326.
- Liu, F.; Wang, Z.; Ji, Y. Precise initial weld position identification of a fillet weld seam using laser vision technology. Int. J. Adv. Manuf. Technol. 2018, 99, 2059–2068.
- Yang, L.; Fan, J.; Liu, Y.; Li, E.; Peng, J.; Liang, Z. Automatic Detection and Location of Weld Beads With Deep Convolutional Neural Networks. IEEE Trans. Instrum. Meas. 2021, 70, 5001912.
- Li, J.; Liu, Z.; Wang, J.; Gu, Z.; Shi, Y. A novel initial weld position identification method for large complex components based on improved YOLOv5. In Proceedings of the 2023 3rd International Conference on Robotics and Control Engineering, Nanjing, China, 12–14 May 2023.
- Stanford Artificial Intelligence Laboratory. Robotic Operating System. Available online: https://www.ros.org/ (accessed on 24 June 2024).
- Rusu, R.B.; Cousins, S. 3D is here: Point Cloud Library (PCL). In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Shanghai, China, 9–13 May 2011.
- Jocher, G.; Chaurasia, A.; Qiu, J. YOLO, Version 8; Ultralytics: Los Angeles, CA, USA, 2023.
- Ku, N.; Ha, S.; Roh, M.I. Design of controller for mobile robot in welding process of shipbuilding engineering. J. Comput. Des. Eng. 2014, 1, 243–255.
- Shinichi, S.; Muraoka, R.; Obinata, T.; Shigeru, E.; Horita, T.; Omata, K. Steel Products for Shipbuilding; JFE Technical Report; JFE Holdings: Tokyo, Japan, 2004.
- Hussain, M. YOLO-v1 to YOLO-v8, the Rise of YOLO and Its Complementary Nature toward Digital Manufacturing and Industrial Defect Detection. Machines 2023, 11, 677.
- Detect—Ultralytics YOLOv8 Docs. Available online: https://docs.ultralytics.com/tasks/detect/ (accessed on 24 June 2024).
- Dwyer, B.; Nelson, J.; Solawetz, J. Roboflow, Version 1.0; Roboflow: Des Moines, IA, USA, 2022.
- Kirillov, A.; Mintun, E.; Ravi, N.; Mao, H.; Rolland, C.; Gustafson, L.; Xiao, T.; Whitehead, S.; Berg, A.C.; Lo, W.Y.; et al. Segment Anything. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Paris, France, 2–6 October 2023.
- Tsai, R.Y.; Lenz, R.K. A New Technique for Fully Autonomous and Efficient 3D Robotics Hand/Eye Calibration. IEEE Trans. Robot. Autom. 1989, 5, 345–358.
- OpenCV: Camera Calibration and 3D Reconstruction. Available online: https://docs.opencv.org/3.4/d9/d0c/group__calib3d.html (accessed on 24 June 2024).
- Hartley, R.; Zisserman, A. Multiple View Geometry in Computer Vision; Cambridge University Press: Cambridge, UK, 2004.
- Dawson-Howe, K.M.; Vernon, D. Simple pinhole camera calibration. Int. J. Imaging Syst. Technol. 1994, 5, 1–6.
- Ha, J.E. Automatic detection of chessboard and its applications. Opt. Eng. 2009, 48, 067205.
- Fischler, M.A.; Bolles, R.C. Random sample consensus. Commun. ACM 1981, 24, 381–395.
Model | Size (Pixels) | mAP val 50–95 | Speed CPU ONNX (ms) | Speed A100 TensorRT (ms) | Params (M) | FLOPs (B) |
---|---|---|---|---|---|---|
YOLOv8n | 640 | 37.3 | 80.4 | 0.99 | 3.2 | 8.7 |
YOLOv8s | 640 | 44.9 | 128.4 | 1.20 | 11.2 | 28.6 |
YOLOv8m | 640 | 50.2 | 234.7 | 1.83 | 25.9 | 78.9 |
YOLOv8l | 640 | 52.9 | 375.2 | 2.39 | 43.7 | 165.2 |
YOLOv8x | 640 | 53.9 | 479.1 | 3.53 | 68.2 | 257.8 |
Process | Mean | Deviation | Min | Max |
---|---|---|---|---|
Novel 2D image | 308.472 | 19.349 | 272.404 | 357.010 |
3D point cloud | 6097.310 | 3503.454 | 679.888 | 15,828.210 |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Lopez-Fuster, M.-A.; Morgado-Estevez, A.; Diaz-Cano, I.; Badesa, F.J. A Neural-Network-Based Cost-Effective Method for Initial Weld Point Extraction from 2D Images. Machines 2024, 12, 447. https://doi.org/10.3390/machines12070447