1. Introduction
With the advancement of welding automation and intelligence, computer vision-based monitoring of the welding process has been increasingly studied and applied. Real-time monitoring plays a critical role in ensuring the quality and stability of welding processes, particularly across various welding techniques such as laser welding [1], butt fusion welding [2], arc welding [3], and friction stir welding [4]. To ensure the stability of the welding process, the shape and dimensions of the molten pool are commonly used as key monitoring parameters [5,6]. While computer vision techniques have demonstrated high accuracy in feature recognition and extraction under standard conditions, the presence of interference factors, such as welding fumes, spatter, and intense arc light, significantly degrades the performance of control systems and increases the asymmetry of the molten pool [7]. This interference leads to instability in methods that rely solely on variations in molten pool images for control, causing significant asymmetry in the molten pool, disrupting the process, and making it challenging to maintain consistent weld quality [8,9]. Therefore, developing a specialized welding process control system capable of robust molten pool monitoring under such interference conditions is crucial for maintaining symmetry and stability throughout the welding process.
Interference factors such as welding fumes, spatter, and intense arc light have long been a focal point in research on image-based welding monitoring technologies [10,11,12]. Although various methods have been proposed to address these challenges, they still fail to meet the accuracy and robustness requirements for molten pool recognition and extraction. The primary limitation of these methods lies in the highly variable nature of interference characteristics: welding fumes exhibit temporal variation and persistence, spatter is highly random, and intense arc light demonstrates adhesion and randomness. Consequently, conventional vision-based approaches often rely on dedicated modules tailored to handle each type of interference. These methods typically involve complex nested rules, such as adaptive threshold segmentation and combined filtering techniques. For example, filtering techniques such as Gaussian and median filters remove noise from the images [13]. Despite their ability to handle specific types of interference, these methods are often computationally expensive and require fine-tuning for different welding environments [14]. Furthermore, they are generally more susceptible to dynamic interference, such as sudden changes in arc light intensity or highly erratic spatter, which can still lead to misclassification of molten pool features [15]. Thus, while traditional vision-based approaches have been widely used, their robustness under varying real-world conditions remains a significant challenge [16]. In contrast, data-driven machine vision methods have demonstrated superior performance in welding process monitoring, offering higher recognition accuracy and greater robustness against interference [17,18]. These methods typically leverage large volumes of historical data, where molten pool features and various interference patterns (such as fume-induced blurring or spatter-induced artifacts) are explicitly labeled and categorized [19]. By learning from this annotated data, machine learning models can capture complex relationships between visual features and welding conditions, thereby improving their resistance to disturbances during real-time monitoring [20]. However, despite their advantages in recognition accuracy, data-driven methods often suffer from lower precision in molten pool dimension extraction due to inherent limitations in feature modeling [21]. To address this trade-off between recognition accuracy and extraction precision, this study proposes a hybrid molten pool dimension extraction method that integrates rule-based and data-driven vision techniques, improving extraction accuracy while maintaining high recognition performance.
In the context of welding process control based on molten pool image monitoring, the length and width of the molten pool are commonly used monitoring variables, while welding speed serves as the primary control parameter [22,23]. Existing control methods typically adjust welding speed based solely on changes in these monitoring variables, meaning that speed modifications are directly determined by the deviation of the current molten pool dimensions from the target values [24]. However, this approach does not account for the potential instability of the welding process when rapid changes in speed occur, which can destabilize the molten pool and result in defects such as spatter or incomplete fusion [25,26]. Welding stability is influenced by multiple factors, and to achieve uniform weld quality, rapid speed adjustments should be avoided when stability is low [27,28]. Since the stability of molten pool size variations is closely correlated with the overall welding process stability, this study introduces a novel approach that simultaneously monitors molten pool variations and evaluates their stability to characterize the welding process state. The goal of this study is to develop a method that not only monitors the molten pool’s size but also evaluates its stability, so that welding speed is adjusted dynamically based on real-time stability assessments to improve weld quality.
In this study, a novel two-stage welding process control system is developed by integrating rule-based and data-driven vision techniques. In the first stage, an advanced molten pool dimension extraction method is proposed to enhance extraction precision while maintaining recognition accuracy. In the second stage, a vision-based control strategy for welding speed control is introduced, where welding stability is incorporated into speed control, a consideration not addressed in previous methods. By introducing a speed adjustment factor, the system achieves adaptive speed regulation based on welding process stability. Finally, an experimental platform is constructed to validate the proposed approach. In the experiments, the proposed speed control strategy effectively adjusts welding speed based on process stability, leading to improved weld uniformity and enhanced welding process stability. From an industrial perspective, this method could be applied to various welding environments to optimize speed control, ensuring higher consistency and quality in automated welding processes.
2. Molten Pool Extraction Method Combining Rule-Based and Data-Driven Vision
The molten pool images captured during monitoring contain information about the shape and size of the molten pool. In this study, the length and width of the molten pool are used as the basis for speed control. Image processing is divided into two parts: recognition and positioning, followed by size extraction.
2.1. Molten Pool Recognition and Positioning
During welding, visual image monitoring is commonly affected by interference factors such as welding fumes, spatter, and intense arc light, which can obscure molten pool information. The three typical interference characteristics and a normal molten pool image are shown in Figure 1. These interferences introduce specific challenges to molten pool recognition and positioning:
Welding fumes interference causes blurring in the image but does not completely obscure the molten pool. Its variation over time introduces dynamic challenges in tracking the molten pool shape.
Spatter interference introduces random and intense disturbances that block key features of the molten pool, leading to difficulties in accurate recognition and edge extraction.
Intense arc light interference results in high intensity, which can saturate the image, obscuring the molten pool’s boundaries and causing confusion in its shape recognition.
These interferences, which often occur alternately or simultaneously during welding, significantly complicate molten pool recognition. Therefore, in order to achieve accurate recognition and positioning, this study employs a data-driven method that utilizes real-world welding images with multiple interference factors as a training dataset to enhance feature detection and improve recognition performance.
R-CNN uses the selective search algorithm to generate region proposals at different scales from the input image. It then trains a CNN to extract features from each proposed region and uses a classifier to categorize the extracted features. Bounding-box regression is used to refine the position of the bounding box [20]. Because R-CNN classifies pre-generated region proposals rather than learning localization end to end, it requires less training data than end-to-end methods, and it can effectively recognize molten pool regions from monitoring images. The algorithm principle is shown in Figure 2. R-CNN is chosen to define molten pool regions because it provides a good balance between recognition performance and stability. Since the molten pool’s features are relatively simple, the recognition task is not highly complex, and R-CNN achieves reliable results with a smaller amount of training data than other methods, making it a suitable choice for this application.
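The bounding-box refinement step mentioned above can be illustrated with a minimal sketch. It assumes the centre/size parameterization commonly used for R-CNN-style box regression (centre shifts scaled by the box size, size changes in log space); the function name `apply_box_deltas` and the numeric values are illustrative and are not taken from the authors' implementation.

```python
import math

def apply_box_deltas(proposal, deltas):
    """Refine a region proposal with R-CNN-style regression offsets.

    proposal: (cx, cy, w, h) centre and size of the proposed box, in pixels.
    deltas:   (dx, dy, dw, dh) offsets predicted by the box regressor.
    """
    cx, cy, w, h = proposal
    dx, dy, dw, dh = deltas
    new_cx = cx + w * dx        # shift the centre by a fraction of the box width
    new_cy = cy + h * dy        # shift the centre by a fraction of the box height
    new_w = w * math.exp(dw)    # rescale in log space so the size stays positive
    new_h = h * math.exp(dh)
    return (new_cx, new_cy, new_w, new_h)

# Illustrative values only: a 40 x 20 proposal nudged right, moved up, and doubled in height.
refined = apply_box_deltas((100.0, 80.0, 40.0, 20.0), (0.1, -0.5, 0.0, math.log(2.0)))
```

With all offsets equal to zero the proposal is returned unchanged, which is why the regressor only has to learn small corrections around each proposal.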
R-CNN is used to identify molten pool regions, and molten pool monitoring images from various welding processes under different noise interference conditions are collected to train the model and ensure its robustness. To improve the generalization ability of the model and prevent overfitting, the direction and location of features in the images are randomly altered, the images are scaled randomly, and random disturbances are added to the images using Gaussian noise. The results after applying these image augmentation methods are shown in Figure 3.
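The augmentation steps listed above can be sketched in plain Python. This is only an illustration under stated assumptions: images are nested lists of gray levels, "direction" is varied by a random flip and a rotation in 90-degree steps, "location" by a cyclic horizontal shift, and the scale range (0.8 to 1.2) and noise level (`sigma = 5.0`) are hypothetical values, not the settings used in the paper.

```python
import random

def augment(image, rng):
    """Return a randomly flipped, rotated, shifted, rescaled and noised copy.

    image: list of rows, each a list of gray levels in 0..255.
    rng:   random.Random instance, so that results are reproducible.
    """
    out = [row[:] for row in image]
    if rng.random() < 0.5:                      # direction: random horizontal flip
        out = [row[::-1] for row in out]
    for _ in range(rng.randrange(4)):           # direction: rotate by k * 90 degrees
        out = [list(col) for col in zip(*out[::-1])]
    dx = rng.randrange(len(out[0]))             # location: cyclic horizontal shift
    out = [row[-dx:] + row[:-dx] for row in out]
    scale = rng.uniform(0.8, 1.2)               # size: nearest-neighbour rescale
    h, w = len(out), len(out[0])
    new_h, new_w = max(1, round(h * scale)), max(1, round(w * scale))
    out = [[out[min(h - 1, int(r / scale))][min(w - 1, int(c / scale))]
            for c in range(new_w)] for r in range(new_h)]
    sigma = 5.0                                 # disturbance: additive Gaussian noise
    return [[min(255, max(0, round(p + rng.gauss(0.0, sigma)))) for p in row]
            for row in out]

# Synthetic 10 x 8 test image; a real pipeline would load camera frames instead.
image = [[(r * 8 + c) % 256 for c in range(8)] for r in range(10)]
augmented = augment(image, random.Random(0))
```

Passing a seeded `random.Random` keeps the augmentation reproducible, which makes it easier to compare training runs.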
During recognition, the data-driven method operates based on sample data. Therefore, as long as the features are present in the sample data, the molten pool can be recognized.
2.2. Molten Pool Size Extraction
The shape and size of the molten pool are contained within the bounding box region identified by R-CNN. In this study, a rule-based visual image processing technique is used to extract the dimensions of the molten pool. First, within the identified region, the molten pool is the brightest part, corresponding to the highest gray level. Thus, the Otsu algorithm is used to segment the points within the identified region. The Otsu algorithm searches the grayscale histogram for the threshold that maximizes the between-class variance of the two resulting pixel classes (equivalently, minimizes the within-class variance) and uses it as the optimal threshold for binarization. After this step, the points in the bounding box region are either black or white. The largest white area is then identified, and this region corresponds to the shape of the molten pool.
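The segmentation step can be sketched as follows: Otsu's threshold is computed from the histogram, and the largest connected bright region is then selected. The sketch is written in pure Python for self-containment; the function names, the use of 4-connectivity, and the tiny synthetic image are assumptions made for illustration only.

```python
from collections import deque

def otsu_threshold(pixels):
    """Return the gray level t (0..255) that maximises between-class variance.

    Pixels with a value greater than t form the bright class.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(g * n for g, n in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the dark class
    sum0 = 0  # sum of gray levels in the dark class
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        mu0, mu1 = sum0 / w0, (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2  # proportional to between-class variance
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t

def largest_bright_region(image):
    """Binarise with Otsu's threshold and return the largest 4-connected white region."""
    t = otsu_threshold([p for row in image for p in row])
    h, w = len(image), len(image[0])
    seen = [[False] * w for _ in range(h)]
    best = set()
    for r in range(h):
        for c in range(w):
            if image[r][c] <= t or seen[r][c]:
                continue
            region, queue = set(), deque([(r, c)])
            seen[r][c] = True
            while queue:  # breadth-first flood fill of one white region
                y, x = queue.popleft()
                region.add((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if 0 <= ny < h and 0 <= nx < w and not seen[ny][nx] and image[ny][nx] > t:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(region) > len(best):
                best = region
    return best

# Synthetic example: a 2 x 2 bright "pool" and one isolated bright "spatter" pixel.
image = [[10] * 6 for _ in range(5)]
for r, c in [(1, 1), (1, 2), (2, 1), (2, 2)]:
    image[r][c] = 200
image[4][5] = 220
pool = largest_bright_region(image)
```

In this toy image the isolated bright pixel is discarded because it forms a smaller region than the pool, which mirrors the "largest white area" rule in the text.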
Due to interference from spatter or intense arc light during the welding process, the molten pool area may become attached to high-brightness regions formed by spatter or arc light. Considering that the shape of the attached contours significantly differs from an interference-free molten pool, and that this feature is influenced by various complex factors, this paper designs extraction rules based on the molten pool shape. Through qualitative analysis, the molten pool is found to have an elliptical shape, with the left side being more elongated. Regarding the interference, spatter typically appears as small protrusions along the molten pool’s edge, while intense arc light tends to accumulate in large, expanded regions at the front of the molten pool. These interferences, when reflected in the image, lead to random, non-uniform local expansions and erosions, causing irregular protrusions or indentations along the edges. The large-area expansion is concentrated on the right side of the image, while the rest of the expansion and erosion occur to a lesser extent.
Based on this analysis, extraction rules are designed. First, the edges on the upper-left and lower-left sides of the molten pool are extracted and then smoothed to eliminate the protrusions and indentations on the left side of the image. Next, the furthest feasible point on the right side of the image is identified, and this point is used as the rightmost position to fit an elliptical edge that is tangent to the two smoothed edges on the left. A feasible point is a point on the right edge of the image, near the molten pool’s lateral centerline, for which the elliptical edge generated with that point as its rightmost position does not overlap the black region. After processing, the molten pool region’s edge will no longer be affected by interference.
Next, the covariance matrix of the positions of all points within the molten pool region is computed, and the eigenvector whose direction is closest to the welding torch’s movement is taken as the welding direction. The longest extent of the molten pool region along the welding direction is taken as the length of the molten pool. Subsequently, two lines parallel to the welding direction are placed so that they bound the molten pool region on both sides. The distance between these two lines represents the width of the molten pool.
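The length and width measurement can be sketched with a closed-form eigen-decomposition of the 2 x 2 covariance matrix. The function name `pool_length_width`, the `torch_dir` argument (an approximate travel direction in image coordinates), and the rectangular test region are assumptions for illustration; extents are measured between pixel centres.

```python
import math

def pool_length_width(points, torch_dir):
    """Measure a region along and across its welding direction.

    points:    iterable of (x, y) pixel positions inside the molten pool region.
    torch_dir: (dx, dy), approximate direction of torch travel in the image.
    Returns (length, width) as extents between pixel centres.
    """
    pts = list(points)
    n = len(pts)
    mx = sum(x for x, _ in pts) / n
    my = sum(y for _, y in pts) / n
    sxx = sum((x - mx) ** 2 for x, _ in pts) / n
    syy = sum((y - my) ** 2 for _, y in pts) / n
    sxy = sum((x - mx) * (y - my) for x, y in pts) / n
    # Eigenvectors of the symmetric covariance matrix: the major axis makes the
    # angle theta with the x-axis and the minor axis is perpendicular to it.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    axes = [(math.cos(theta), math.sin(theta)), (-math.sin(theta), math.cos(theta))]
    # Welding direction: the eigenvector best aligned with the torch movement.
    ux, uy = max(axes, key=lambda a: abs(a[0] * torch_dir[0] + a[1] * torch_dir[1]))
    vx, vy = -uy, ux  # unit vector perpendicular to the welding direction
    along = [x * ux + y * uy for x, y in pts]
    across = [x * vx + y * vy for x, y in pts]
    # Width equals the distance between the two bounding lines parallel to the
    # welding direction, i.e. the extent of the projections onto the normal.
    return max(along) - min(along), max(across) - min(across)

# Axis-aligned test region: 11 x 5 grid of pixel centres, torch moving along x.
region = [(x, y) for x in range(11) for y in range(5)]
length, width = pool_length_width(region, (1.0, 0.0))
```

Because both measurements are taken in the region's own principal-axis frame, rotating the image rotates the axes with it, which is consistent with the rotational robustness reported later in the paper.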
3. Welding Speed Control Method Considering Welding State Stability and Two-Stage Welding Control System
In this section, a welding speed control method is proposed that considers molten pool size variations and welding state stability. The stability of the welding state is innovatively incorporated into speed control and reflected in the adjustment of speed via a speed adjustment factor. The stability of molten pool size variations is used as an indicator of welding state stability. Furthermore, based on the proposed visual extraction method and control strategy, a two-stage welding speed control system is developed.
3.1. Welding Speed Control Method Based on Molten Pool Size Variations and Welding State Stability
In the previous section, molten pool size information during the welding process was extracted. Considering the fidelity of the information contained in molten pool images, this study uses the length and width of the molten pool as the basis for speed control, thereby establishing a relationship between the changes in molten pool length and width and the change in welding speed. This relationship can be represented by Equation (1):
$$\Delta v = f(\Delta L, \Delta W, L_0, W_0) \tag{1}$$
where $\Delta v$ represents the speed adjustment, $\Delta L$ and $\Delta W$ represent the changes in molten pool length and width, and $L_0$ and $W_0$ represent the target values for molten pool length and width. In welding, assuming all other parameters remain constant, an increase in welding speed results in a decrease in molten pool size. Thus, to maintain a constant molten pool size, the relationship $f$ represents a control method that adjusts the output based on the input values: the speed is raised when the molten pool is larger than the target and lowered when it is smaller. As described previously, this relationship represents a classical speed control strategy that adjusts the speed based solely on the difference in molten pool size, ignoring the current welding state.
In this study, the stability of molten pool size variations is used to assess the stability of the welding process. The length and width of the molten pool are treated as one-dimensional sequences that change over time. The variance in these dimensions over a specified range of data is calculated and used to represent stability. More specifically, the variance is calculated for both the length and the width of the molten pool in a moving window of size $n$ (representing the range of data considered for stability evaluation), as expressed in Equation (2):
$$S_i = w_L \,\mathrm{Var}(L_{i-n+1}, \dots, L_i) + w_W \,\mathrm{Var}(W_{i-n+1}, \dots, W_i) \tag{2}$$
where $S_i$ represents the welding state stability at the current position, $w_L$ and $w_W$ are the weights for molten pool length and width, $i$ is the index corresponding to the current welding position, and $n$ is the range used to calculate the variance. The variance for each dimension (length and width) reflects how much the molten pool size fluctuates, with higher variance indicating less stability.
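The moving-window stability measure described above can be sketched as follows. Several details are assumptions rather than statements from the paper: the two variances are combined as a weighted sum, the population variance is used, the window ends at the current index, and the equal weights are placeholders.

```python
from statistics import pvariance

def stability_index(lengths, widths, i, n, w_l=0.5, w_w=0.5):
    """Weighted variance of pool length and width over the last n samples up to index i.

    A larger value means the molten pool size fluctuates more (less stable).
    The window is shortened automatically near the start of the sequence.
    """
    start = max(0, i - n + 1)
    return (w_l * pvariance(lengths[start:i + 1])
            + w_w * pvariance(widths[start:i + 1]))

# Illustrative sequences: the length is steady at first, then starts to oscillate.
lengths = [10, 10, 10, 10, 12, 8]
widths = [5, 5, 5, 5, 5, 5]
steady = stability_index(lengths, widths, i=3, n=4)
fluctuating = stability_index(lengths, widths, i=5, n=4)
```

The window size trades responsiveness against noise: a short window reacts quickly to a disturbance, while a long window gives a smoother but delayed estimate.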
To link welding state stability with the speed adjustment, a speed adjustment factor $K$ is introduced. When the molten pool dimensions fluctuate significantly, the variance is larger, leading to a larger value of the stability measure $S_i$ of Equation (2), which necessitates reducing the speed change rate to avoid destabilizing the molten pool due to rapid speed adjustments. Therefore, $K$ is inversely proportional to $S_i$. The factor $K$ is expressed in Equation (3):
$$K = \frac{\alpha}{S_i} \tag{3}$$
where $\alpha$ is the coefficient representing the proportional relationship between welding state stability and the speed adjustment factor. $K$ is then incorporated into the speed adjustment calculation to adjust welding speed based on molten pool size variations and welding state stability, as shown in Equation (4):
$$\Delta v = K \cdot f(\Delta L, \Delta W, L_0, W_0) \tag{4}$$
where $f(\cdot)$ is the size-based speed adjustment of Equation (1). This forms the control strategy for the speed control system.
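A minimal sketch of how such a stability-scaled speed adjustment could be computed is given below. It is not the authors' implementation: the size-based term is assumed to be a simple proportional law on the relative deviation from the target, the factor is assumed to scale that term multiplicatively, and `alpha`, the gains, the small constant `eps` (which avoids division by zero), and the cap `k_max` are all hypothetical parameters.

```python
def speed_adjustment(length, width, target_l, target_w, stability,
                     alpha=1.0, gain_l=0.5, gain_w=0.5, eps=1e-6, k_max=1.0):
    """Return a welding speed change scaled by the current process stability.

    stability: variance-based measure; larger values mean a less stable process.
    """
    # Assumed size-based term: a pool larger than the target calls for a higher speed.
    base = (gain_l * (length - target_l) / target_l
            + gain_w * (width - target_w) / target_w)
    # Adjustment factor: inversely proportional to the stability measure,
    # capped so that a very steady process does not amplify the correction.
    k = min(k_max, alpha / (stability + eps))
    return k * base

# Same size deviation, evaluated for a steady and a fluctuating molten pool.
calm = speed_adjustment(11.0, 5.0, 10.0, 5.0, stability=0.5)
noisy = speed_adjustment(11.0, 5.0, 10.0, 5.0, stability=4.0)
```

For the same size deviation, the fluctuating case receives a smaller speed change, which is the behaviour the text asks for: slower corrections when the process is less stable.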
3.2. Two-Stage Welding Control System Based on Visual Monitoring
Based on the proposed visual extraction method and control strategy, a two-stage welding speed control system is developed. The overall control process of the system is shown in Figure 4, with all operations completed within a single control cycle. The two functional stages of the system work sequentially to ensure precise adjustment and stability of welding speed.
In the first stage, the proposed molten pool size extraction method, combining rule-based and data-driven visual techniques, is applied. Specifically, this method uses an efficient visual recognition algorithm to extract molten pool size information during the current control cycle. Through image processing and computer vision techniques, the system can obtain real-time data on the location and size of the molten pool from the welding site, providing precise data support for subsequent control processes. The goal of this stage is to ensure the high-accuracy capture of molten pool dimensions and to provide reliable input data for welding speed adjustment.
In the second stage, based on the molten pool size extracted in the first stage, the system applies the proposed welding speed control method based on molten pool size variations and welding state stability. The system evaluates the stability of the welding process based on the real-time changes in molten pool length and width and dynamically adjusts the welding speed. Specifically, the system first calculates the changes in molten pool length and width and, based on these changes and the stability of the welding state, derives the speed adjustment. The core of this process lies in dynamically adjusting the welding speed by analyzing changes in molten pool size and stability to ensure the stability and quality of the welding process.
The design of the control system considers the deep integration of visual data and control strategies, fully utilizing the correlation between molten pool size changes and welding state stability to form a precise, efficient, and adaptive control mechanism. By monitoring molten pool size and adjusting welding speed in real-time, the system can precisely control the welding process, ensuring stable weld quality and significantly improving production efficiency.
4. Experiments and Discussion
4.1. Experimental Setup
The welding and molten pool monitoring system used in the experiments consists of a six-degree-of-freedom welding robot, a 3D vision sensor (SR7300, SSZN, Shenzhen, China), and a camera. The 3D vision sensor is used solely for pre-welding inspection. The welding robot consists of six Kollmorgen RGM joint modules (RGM14, RGM20, RGM25, Kollmorgen, Radford, VA, USA) and a CO2 gas shielded welding machine (TDN 5001MB, Time Group Inc., Beijing, China). During the welding process, the robot is equipped with an industrial camera (MV-CH120-10 GM, Hikvision, Hangzhou, China), which is fitted with a 1064 nm ± 10 nm optical filter and a 50% neutral density filter to enable real-time molten pool image feedback. The camera captures real-time images of the molten pool during welding, which are then transmitted to the system for further analysis. Because the light reaching the camera passes through the optical and neutral density filters, unwanted light noise under strong lighting conditions is effectively removed, enhancing the clarity and accuracy of the molten pool images. The experimental platform and data acquisition process, along with the control process, are shown in Figure 5.
To achieve precise molten pool monitoring, the system evaluates the morphological features of molten pool in real-time by combining the molten pool images from the camera with welding state parameters. By analyzing the morphological changes of the molten pool in real-time, the system can accurately capture the changes in the welding state and provide real-time monitoring results. These results not only help monitor the stability of the welding process but also provide the necessary data for adjusting subsequent welding parameters. Ultimately, the system generates a real-time visual model of the molten pool’s morphology and provides data support for optimizing the welding process.
4.2. Accuracy Verification of Molten Pool Size Extraction Method
After data augmentation, approximately 4000 molten pool images are obtained, and the regions containing molten pools are annotated. The molten pool images are randomly split into training and test sets at a 6:4 ratio. The training batch size is 64, and the maximum number of training steps is 200. The loss function and accuracy curves during training are shown in Figure 6. The x-axis represents the training time, and the y-axis represents the loss and the accuracy of the current network output relative to the actual result. After 200 iterations, the model has largely converged.
Additionally, the complete molten pool size extraction process proposed in Section 2.2 is shown in Figure 7.
To validate the accuracy of the proposed molten pool extraction method combining rule-based and data-driven vision, an accuracy verification experiment is designed. Several molten pool images, captured during actual welding processes with interference, are used for testing. The test results are presented in Table 1, which includes the original monitoring images, the types of interference in the images, and the molten pool recognition, shape extraction, and size extraction results obtained at each processing step.
The results demonstrate that the proposed method exhibits strong adaptability to welding fumes, spatter, and strong arc light interference, and it performs well even under combinations of these interferences. This can be attributed to the hybrid approach, which combines both data-driven and rule-based methods. The data-driven method is first applied to recognize the molten pool in the raw data, and the rule-based method is then used to extract features within the recognized area. As a result, the proposed method benefits from the recognition accuracy of the data-driven approach and the extraction precision of the rule-based method.
Furthermore, the experimental results show that the proposed method also demonstrates strong adaptability to molten pool image rotation. This is due to the data augmentation performed during training and to the independence of the extraction method from the absolute position of features. Therefore, the proposed method exhibits good rotational invariance.
4.3. Accuracy Verification of Welding Speed Control Method
To verify the effectiveness of the proposed method, a weld seam formation size control experiment is conducted. The weld seam formation size without speed control is used for comparison. During the welding experiment, the welding current and voltage are treated as interference inputs: they are not fixed, which introduces unknown disturbances into the controlled process. Welding speed control is initiated at a specific moment, and prior to control, the welding speed remains at its initial value. When the welding speed control begins, a target molten pool size is set, and the control system uses the molten pool image inputs to extract the molten pool size using the method proposed in this paper. Based on Equation (4), the system adjusts the welding speed to keep the molten pool width at the target value. After welding, the weld seam width is measured, and the experimental results are shown in Figure 8.
The figure shows the results after applying the proposed control method, compared to the results without speed control. The results indicate that after applying the proposed control method, the weld seam width gradually converges to the target width. The experimental results demonstrate that the proposed control method successfully achieves control of weld seam formation size, validating the effectiveness of the method.
Subsequently, to compare the performance of the proposed method based on Equation (4) with other speed control methods, a comparative experiment on welding speed control is performed. A negative feedback control system based on the molten pool size variation and the target size difference is used for comparison. In the welding experiment, both welding speed control systems are initiated at the same moment. Before control, the states of both welding systems are identical, and the target molten pool size setting is the same after control begins. After control initiation, both systems use the molten pool size extraction method proposed in this paper to obtain the required control input data and adjust the welding speed based on their respective control methods. After welding, the final weld seam formation sizes of both systems are measured, and the welding speeds during the process are recorded. The final weld seam formation sizes of the two systems are similar, indicating that both control methods have comparable performance in controlling weld seam formation size. The welding speeds during the welding process for both methods are presented in Figure 9.
The experimental results show that the proposed welding speed control method, which considers welding state stability, offers higher control stability than the comparison method. The stability of welding speed directly reflects the stability of the welding process. To better illustrate the performance of the proposed method, the speed control stability of both methods is quantitatively analyzed. The variance, a commonly used indicator of the stability of sequence data, is calculated for the speed control process of both methods. The variance of the proposed method is 0.00083, while the variance of the comparison method is 0.0012. The variance of the proposed method is therefore approximately 31% lower (equivalently, the variance of the comparison method is approximately 45% higher). This result demonstrates that the proposed control method achieves superior welding process stability while still meeting the requirements for weld seam formation size control.
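The comparison of the two reported variances depends on which value is taken as the reference, which a few lines of arithmetic make explicit. Only the two variance values come from the paper; the variable names are illustrative.

```python
var_proposed = 0.00083   # variance of welding speed, proposed method
var_baseline = 0.0012    # variance of welding speed, comparison method

# Reduction relative to the comparison method.
reduction = (var_baseline - var_proposed) / var_baseline
# Excess of the comparison method relative to the proposed method.
excess = (var_baseline - var_proposed) / var_proposed

print(f"{reduction:.1%} lower, {excess:.1%} higher")  # prints "30.8% lower, 44.6% higher"
```

Stating the reference value alongside the percentage avoids ambiguity when the result is quoted elsewhere.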