1. Introduction
With the trend toward integration and miniaturization in electronic products, chip components emerged, giving rise to the reflow soldering process, which quickly replaced traditional manual soldering in production. However, this advance also complicates process design and introduces numerous influencing factors. As process technology continues to develop, high reliability and robustness have become industry trends, especially in fields such as aerospace, aviation, and missile applications, where stringent operating environments require precise design of the reflow soldering process window to meet high-quality soldering requirements. For most electronic products, yield is the primary concern in production, so achieving a highly reliable and robust reflow soldering process is crucial to ensuring the performance and functionality of electronic devices.
In most practical production, process design commonly relies on a limited number of experiments and adjustments of process parameters, known as the "trial and error" method, to determine the optimal process parameters. This approach is inherently flawed: it involves long design cycles and high costs, and it fails to account for uncertain factors such as material fluctuations and environmental loads, making a high product yield difficult to guarantee. In recent years, numerous publications have focused on the analysis of the reflow soldering process. Esfandyari et al. [1] developed a finite element simulation model to study the relationship between process parameters, simulated thermal processes, and porosity in solder joints of a reflow oven under pressure. Khatibi et al. [2] conducted finite element simulations of PbSnAg solder using a stress-rate- and pressure-dependent material model to analyze its stress and strain states under static and cyclic loading. Li Z.H. et al. [3] employed molecular dynamics to investigate the growth of intermetallic compounds (IMCs) between low-silver composite solders and copper substrates during reflow soldering; they explored the factors influencing copper atom diffusion rates and explained how the addition of CeO2 nanoparticles affects copper atom diffusion behavior and interfacial IMC growth. Long Xu et al. [4] improved the Coffin–Manson model and used the finite element method to predict the fatigue life of SAC305 solder joints under temperature cycling coupled with electrical current, revealing the advantages of lead-free solder with high yield strength and ultimate strength in terms of thermal fatigue and current density. It is evident that there has been considerable research on reflow soldering, particularly regarding reliability, in response to the trends of high reliability and robustness.
The current research on the robustness of reflow soldering processes is relatively limited and mainly relies on empirical or semi-empirical designs. Chowdhury et al. [5] used orthogonal experimental design to analyze the main factors causing variations in lead-free processes and optimized solder wetting, verifying the best-strength combination through thermal cycling. Kung Chieh et al. [6] employed orthogonal experimental design to perform a robust design analysis of the thermal-mechanical reliability of MCM (multi-chip module) packaging with flip-chip technology; they identified the substrate's CTE (coefficient of thermal expansion) as the most significant factor influencing coating fatigue reliability and obtained the optimal parameter combination, under which fatigue life increased by 554.5%. Chun-Sean Lau et al. [7] used the grey Taguchi method to optimize the thermal stress and cooling rate of BGA solder joints, determining the optimal parameter settings for multiple performance characteristics and validating the optimization results through simulation experiments. Zhou Jicheng et al. [8] applied robust design and finite element methods to the thermal-mechanical fatigue life of solder joints; considering eight control factors, they employed experimental design to optimize PBGA (plastic ball grid array) solder joints and obtained the optimal combination scheme. Tsai Tsung-Nan et al. [9] proposed a fuzzy-logic-based Taguchi method to optimize the fine-pitch stencil printing process, using multiple response optimization and analysis of variance (ANOVA) to determine the significant factors and comparing the optimization performance with two hybrid methods. Sridhar Canumalla [10] introduced a response surface method for advanced packaging reliability and robustness design, describing a descending life response function derived from the strain energy principle and presenting a systematic approach to packaging reliability problems at the system level. Fupei Wu et al. [11] presented a robust positioning method for PCB solder joints, which offers higher precision and efficiency for smaller-sized positioning by eliminating uncertain factors. From the existing literature, it is evident that experimental design methods have been used to achieve robust design in reflow soldering processes. However, for aerospace electronic products with stringent requirements, the current methods still have significant limitations. Additionally, the influence of important factors such as the temperatures of the preheating zones and the material properties of the PCBA components on the robustness of the reflow soldering process has not been investigated, mainly because of the cost of experimental design methods. Therefore, modern robust optimization methods that consider input fluctuations and ensure output consistency are urgently needed in the design of reflow soldering processes to meet the development trends of high reliability and robustness in electronic products.
This study adopts the 6σ criterion and considers fluctuations of the temperature parameters of each reflow oven zone around their set values, as well as the influence of fluctuations in the material properties of PCBA components on the process. By selecting the relevant influencing factors, a robustness evaluation module based on surrogate models and an embedded robustness optimization analysis module are constructed. This enables an automatic search for optimal robust design solutions that accounts for uncertain factors at the initial design stage. The proposed method offers a low-cost, efficient, and high-quality optimization approach, making it highly valuable for improving yield in practical engineering applications.
2. Establishing an Accurate Simulation Model
Reflow soldering, as one of the key processes in SMT (surface mount technology) production, directly affects the soldering quality and reliability of electronic products [12]. Its essence lies in "heating": air is heated, and fans deliver the heated airflow to the soldering surface, melting the solder paste and forming solder joints between surface-mounted components and the PCB without altering the original characteristics of the electronic components [13], as shown in the schematic diagram of the reflow soldering process in Figure 1. The core of this process is the design of temperature profiles and furnace temperature settings. The temperature profile refers to the "temperature–time" curve established for the representative packages and solder paste used in soldering the PCBA, including the "temperature–time" curves at test points on the PCBA. Only within a reasonable process window can the product achieve high reliability. Furnace temperature settings, in turn, are the temperatures that operators set on the control panel for each zone's built-in thermocouples, according to the designed temperature profile.
During production, the temperature profile at the solder joints either equals the furnace temperature setting or differs from it by a fixed amount. Because the joints are mainly influenced by factors such as the environment, materials, and structural dimensions, process parameters generally need to be adjusted by design, including the temperatures of the various zones, the conveyor belt speed, the airflow, etc. [14], to meet the process requirements and ensure a higher yield of good-quality products.
To establish an accurate simulation analysis of the reflow soldering process, this paper uses the Ansys Icepak finite element simulation module, which effectively addresses the inconsistency between the temperature profile at the solder joints and the furnace temperature setting. By simulating the specified set temperatures, the desired temperature profile at the solder joints can be obtained, allowing the rationality of the process parameters to be assessed. This approach omits the calculation of convective heat transfer coefficients in the intermediate process, thereby reducing the errors introduced in the intermediate steps.
This paper focuses on a specific model of hot air reflow soldering oven as the research subject. The main objective is to establish an accurate simulation model for the thermal field of the reflow soldering process. This is achieved through three steps: first, the actual dimensions of the reflow oven cavity are measured; second, an equivalent geometric model of the real PCBA components is created; and third, an equivalent thermal property parameter model is established. By addressing the challenges of measuring certain parameters, this approach enables the construction of a precise simulation model for the hot air reflow soldering process.
2.1. Reflow Soldering Oven Chamber Size
The reflow soldering oven has eight heating zones at the front and two cooling zones at the rear, with a transition area between adjacent zones. As shown in Figure 2, based on actual measurements, each individual reflow zone has width W, length L, and height H from the outlet to the conveyor rail. Each circular nozzle has diameter D and is arranged in a regular hexagonal pattern, with horizontal spacing D1 and vertical spacing D2 between nozzles. The distribution of the oven cavity is consistent both vertically and horizontally. The experimental conveyor speed was fixed at its set value.
2.2. PCBA Component Equivalent Model
This experiment focuses on a specific PCBA produced by a certain manufacturer. The printed circuit board consists of a substrate with copper foil for electrical interconnections and integrates various packaged components, including ball grid arrays (BGAs), chips, and resistors. An equivalent geometric model of the PCBA components is established from the actual components. Given the precision requirements, smaller components that have minimal impact on the temperature field are either neglected or treated in a simplified manner. Complex components such as BGAs are simplified using a thermal resistance network model to obtain the simplified model shown in Figure 3, and the thermal material properties of each component are listed in Table 1.
2.3. Parameter Correction
In the Icepak simulation analysis, determining the temperature and airflow velocity of the nozzles is crucial. The nozzle temperature corresponds to the set temperature, whereas the airflow velocity is difficult to measure accurately yet significantly affects the temperature field. To keep the model within a deviation range of ±10 °C, an accurate simulation model is essential. In the previous literature, the airflow velocity has been calculated from H. Martin's formula [15]. However, this approach has certain limitations [16], and when applied to this experiment it introduces considerable errors: as shown in Figure 4, the simulated curve fits the measured curve poorly, indicating significant discrepancies.
Therefore, this experiment corrects the temperature field against established measurement data in four steps: building the reflow soldering process simulation model, determining the experimental design, constructing the response surface model, and optimizing and solving. Specifically, using an aluminum alloy plate as the test object, with the nozzle air velocities of the 10 temperature zones as variables and the minimization of the difference between the simulated and measured curves at the monitoring points as the objective, the optimized air velocity of each temperature zone is obtained through response surface optimization. The errors between the measured and simulated temperature curves before and after correction are shown in Figure 5. Before correction, roughly 66.7% of the errors exceed the ±10 °C band, with a maximum deviation of nearly 20 °C; after correction, the errors fall mainly within (−5, 5) °C, indicating a significant improvement.
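The error criterion used above can be sketched as follows; the profile data here are made-up illustrations, not the measured curves, and serve only to show how the maximum deviation and the fraction of monitoring points outside the ±10 °C band are computed.

```python
import numpy as np

def profile_error_stats(measured, simulated, window=10.0):
    """Compare a simulated temperature profile against measurements.

    Returns the maximum absolute deviation and the fraction of
    monitoring points whose error falls outside the +/-window band.
    """
    err = np.asarray(simulated, float) - np.asarray(measured, float)
    max_dev = float(np.max(np.abs(err)))
    frac_outside = float(np.mean(np.abs(err) > window))
    return max_dev, frac_outside

# Illustrative (not measured) data: before correction most errors miss
# the +/-10 degC band; after correction they sit within +/-5 degC.
measured = np.array([150.0, 180.0, 210.0, 240.0, 217.0, 160.0])
before   = measured + np.array([12.0, -15.0, 19.8, 8.0, -13.0, 4.0])
after    = measured + np.array([2.0, -4.0, 4.5, 3.0, -1.0, 1.5])

max_b, frac_b = profile_error_stats(measured, before)  # 19.8, 4/6
max_a, frac_a = profile_error_stats(measured, after)   # 4.5, 0.0
```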
To ensure the reliability of the correction results, the actual printed circuit board assembly described above was used as the validation object. The measured results for the PCBA component were compared with the corrected simulation results to verify the accuracy of the correction. The final results are shown in Table 2: the temperature differences between the simulations and the measurements are all less than 6 °C, meeting the accuracy requirement.
4. Establishment of the Surrogate Model
Because of the high computational cost of the original Icepak simulation model, accounting for noise factors and variable fluctuations in the subsequent robust optimization design directly on that model would be impractical. Therefore, a surrogate model was adopted to replace the original model [23].
4.1. The Method for Constructing the Surrogate Model
Unlike surrogate models constructed with a single approximation method such as polynomial regression or kriging, this study used optiSLang with the Metamodel of Optimal Prognosis (MOP) proposed by Most and Will [24]. The fundamental idea of the MOP is to search for the optimal input variable set and the most suitable approximation model. It introduces an independent measure of model quality, the Coefficient of Prognosis (CoP), which is defined as follows:
$$\mathrm{CoP} = 1 - \frac{SS_E^{\mathrm{Prediction}}}{SS_T}$$

In the formula, $SS_E^{\mathrm{Prediction}} = \sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2$ represents the sum of the squares of the prediction errors; $SS_T = \sum_{i=1}^{N}\left(y_i - \bar{y}\right)^2$ represents the total variation of the output; $y_i$ represents the real output dataset; $\hat{y}_i$ represents the output dataset predicted by the meta-model; and $\sigma_y^2$ and $\sigma_{\hat{y}}^2$, respectively, represent the true and predicted variances. The measure $R^2$, which usually evaluates the approximate quality of a regression model, is defined as

$$R^2 = 1 - \frac{SS_E}{SS_T}$$
The difference lies in the choice of datasets: the CoP is calculated from data points that were not used to construct the approximation model, so the model quality is estimated purely from the model's predictive performance.
However, the MOP uses the prediction error rather than the fitting error, making it suitable for both regression and interpolation models. Moreover, to construct fast and reliable approximation models, linear or quadratic bases of polynomials or moving least squares approximation (MLS) were the preferred choices.
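The distinction between $R^2$ (fitting quality) and the CoP (prognosis quality) can be sketched as follows; the quadratic test function, noise level, and train/test split are illustrative assumptions, not the study's data.

```python
import numpy as np

def r_squared(y, y_fit):
    """R^2 on the points used to build the model (fitting errors)."""
    ss_e = np.sum((y - y_fit) ** 2)
    ss_t = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_e / ss_t

def cop(y_test, y_pred):
    """Same formula, but evaluated on points NOT used in the fit,
    so it measures prediction errors (Coefficient of Prognosis)."""
    ss_e = np.sum((y_test - y_pred) ** 2)
    ss_t = np.sum((y_test - y_test.mean()) ** 2)
    return 1.0 - ss_e / ss_t

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 40)
y = 1.0 + 2.0 * x + 0.5 * x**2 + rng.normal(0.0, 0.05, 40)

x_train, x_test = x[:30], x[30:]
y_train, y_test = y[:30], y[30:]
coef = np.polyfit(x_train, y_train, 2)            # quadratic fit
r2 = r_squared(y_train, np.polyval(coef, x_train))
cp = cop(y_test, np.polyval(coef, x_test))
```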
4.1.1. Polynomial Regression
Polynomial regression is a commonly used approximation method that has the advantages of simplicity in construction, low computational complexity, and fast convergence. The model’s response is typically composed of linear or quadratic polynomial basis functions, with or without interaction terms. Its basic theory is as follows:
$$y(\mathbf{x}) = \hat{y}(\mathbf{x}) + \varepsilon$$

In the equation, $y(\mathbf{x})$ represents the actual response function, which is unknown; $\hat{y}(\mathbf{x})$ represents the approximate response function, which is obtained through fitting; and $\varepsilon$ represents the error between the approximation and the actual response.

As a polynomial response model, $\hat{y}$ is represented as

$$\hat{y}(\mathbf{x}) = \mathbf{p}^{T}(\mathbf{x})\,\boldsymbol{\beta}$$

where $\mathbf{p}(\mathbf{x})$ is the polynomial basis function, defined (for a quadratic basis with interaction terms) as

$$\mathbf{p}^{T}(\mathbf{x}) = \left[\,1,\; x_1,\; x_2,\;\ldots,\; x_1^2,\; x_1 x_2,\;\ldots\,\right]$$

The vector $\boldsymbol{\beta}$ is a collection of unknown regression coefficients, obtained by fitting the model to a set of sample points, assuming independent errors with equal variance for each observation. In matrix form, the least squares solution is given by

$$\hat{\boldsymbol{\beta}} = \left(\mathbf{P}^{T}\mathbf{P}\right)^{-1}\mathbf{P}^{T}\mathbf{y}$$

where $\mathbf{P}$ is the polynomial basis matrix evaluated at the sample points, and $\mathbf{y}$ is the vector of sample point values.
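A minimal sketch of this least squares fit, assuming two input variables and a quadratic basis with an interaction term; the sample data are synthetic and exactly representable in the basis, so the recovered coefficients match the generating ones.

```python
import numpy as np

def quad_basis(X):
    """Quadratic polynomial basis for 2 variables:
    p(x) = [1, x1, x2, x1^2, x2^2, x1*x2]."""
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack(
        [np.ones_like(x1), x1, x2, x1**2, x2**2, x1 * x2])

def fit_poly(X, y):
    """Least squares beta = (P^T P)^-1 P^T y, solved via lstsq
    for numerical stability."""
    P = quad_basis(X)
    beta, *_ = np.linalg.lstsq(P, y, rcond=None)
    return beta

rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, (50, 2))
# Synthetic response: 3 - x1 + 2*x2 + 0.5*x1*x2 (no noise)
y = 3.0 - X[:, 0] + 2.0 * X[:, 1] + 0.5 * X[:, 0] * X[:, 1]
beta = fit_poly(X, y)   # recovers [3, -1, 2, 0, 0, 0.5]
```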
4.1.2. Moving Least Squares (MLS) Approximation
In the moving least squares approximation, the local characteristics of regression were obtained by introducing position-dependent radial weighting functions. The basis functions can include any type of function, but typically only linear and quadratic terms are used. This basis function allows for an accurate representation by obtaining the best local fit to the actual interpolation points. The approximation function is defined as follows:
$$\hat{y}(\mathbf{x}) = \mathbf{p}^{T}(\mathbf{x})\,\mathbf{a}(\mathbf{x})$$

To obtain the local regression model in the MLS method, a distance-dependent weighting function $w(d)$ was introduced, where $d$ represents the normalized distance between the interpolation point $\mathbf{x}$ and a supporting point $\mathbf{x}_i$, given by

$$d_i = \frac{\lVert \mathbf{x} - \mathbf{x}_i \rVert}{D}$$

where $D$ is the influence radius, defined as a numerical parameter. Any type of function can be used as the weighting function $w(d)$, and in most cases the well-known Gaussian weighting function is employed:

$$w(d_i) = \exp\left(-\frac{d_i^{2}}{\alpha^{2}}\right)$$

where $\alpha$ is a numerical constant, and the final approximation function is represented as

$$\hat{y}(\mathbf{x}) = \mathbf{p}^{T}(\mathbf{x})\left(\mathbf{P}^{T}\mathbf{W}(\mathbf{x})\,\mathbf{P}\right)^{-1}\mathbf{P}^{T}\mathbf{W}(\mathbf{x})\,\mathbf{y}$$

where the diagonal matrix $\mathbf{W}(\mathbf{x})$ contains the values of the weight function at the $m$ support points.
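A one-dimensional sketch of the MLS scheme above, using a linear basis and Gaussian weights; the influence radius D, the constant alpha, and the sine test function are illustrative choices, not values from the study.

```python
import numpy as np

def mls_predict(x, X, y, D=0.3, alpha=0.4):
    """Moving least squares at point x with linear basis p = [1, x]
    and Gaussian weights w(d) = exp(-d^2/alpha^2), d = |x - x_i|/D."""
    d = np.abs(x - X) / D                          # normalized distances
    W = np.diag(np.exp(-d**2 / alpha**2))          # diagonal weight matrix
    P = np.column_stack([np.ones_like(X), X])      # basis at support points
    a = np.linalg.solve(P.T @ W @ P, P.T @ W @ y)  # local coefficients a(x)
    return float(np.array([1.0, x]) @ a)

X = np.linspace(0.0, 1.0, 11)    # support points
y = np.sin(2.0 * np.pi * X)      # sampled function values
yhat = mls_predict(0.05, X, y)   # local fit near x = 0.05
```

Because the weights decay with distance, each evaluation point gets its own local fit, which is what distinguishes MLS from a single global polynomial regression.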
4.2. Establishing Response Surface
We selected appropriate ranges for the screened design variables. Because of the subsequent response surface optimization analysis, some ranges needed to be expanded appropriately; the ranges for T5~T9 were therefore selected as T5 (160–220 °C), T6 (180–240 °C), T7 (210–270 °C), T8 (210–270 °C), and T9 (80–150 °C). The heating factor was chosen as the objective, with the reflow soldering process requirements as constraints and the influence of noise factors taken into account. Using the established accurate temperature field simulation model, sample points were obtained by Latin hypercube sampling, and different surrogate models were fitted with the methods described above. The quality of each approximate model was evaluated with the CoP and $R^2$, and a correlation analysis report was obtained. The surrogate model with the highest fitting quality was selected to construct the response surface. The basic process is shown in Figure 8.
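Latin hypercube sampling over the stated T5~T9 ranges can be sketched as follows; this is a simple permutation-based implementation, and the sample size of 100 is illustrative.

```python
import numpy as np

# Design-variable ranges from the text (set temperatures, degC).
bounds = {"T5": (160.0, 220.0), "T6": (180.0, 240.0),
          "T7": (210.0, 270.0), "T8": (210.0, 270.0),
          "T9": (80.0, 150.0)}

def latin_hypercube(n, bounds, rng):
    """One random point per equal-probability stratum per variable,
    with independently shuffled strata across variables."""
    dim = len(bounds)
    strata = np.tile(np.arange(n), (dim, 1))
    strata = rng.permuted(strata, axis=1).T          # (n, dim) permutations
    u = (strata + rng.uniform(size=(n, dim))) / n    # stratified U(0, 1)
    lo = np.array([b[0] for b in bounds.values()])
    hi = np.array([b[1] for b in bounds.values()])
    return lo + u * (hi - lo)

rng = np.random.default_rng(42)
samples = latin_hypercube(100, bounds, rng)  # 100 x 5 design matrix
```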
The results of the response analysis are presented in Table 5. Since all constraints and objectives must remain within the appropriate process window, the minimum and maximum values of each response need to be considered.
4.3. Accuracy Verification
To ensure the accuracy of the response surface, the constructed response functions had to be validated. The specific approach was to compare the optimization results based on the response surface with the results of the original temperature field simulation model under the same parameters, ensuring that the error of each response was no greater than 5%. In this validation, noise factors were not considered and were fixed at their initial values. Particle swarm optimization (PSO) [25] was employed for a quick local search on the response surface approximated by the surrogate model. A gradient-based optimization module was constructed using the response surface, and the optimal set of temperatures T5 to T9 was obtained, as shown in Table 6.
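A minimal particle swarm optimizer in the spirit of the PSO search described above; the toy quadratic "response surface", the bounds, and the swarm settings are illustrative assumptions, not the study's surrogate or parameters.

```python
import numpy as np

def pso(f, lo, hi, n_particles=30, iters=200,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal global-best particle swarm minimizer on box bounds."""
    rng = np.random.default_rng(seed)
    dim = len(lo)
    x = rng.uniform(lo, hi, (n_particles, dim))     # positions
    v = np.zeros_like(x)                            # velocities
    pbest = x.copy()                                # personal bests
    pbest_f = np.apply_along_axis(f, 1, x)
    g = pbest[np.argmin(pbest_f)].copy()            # global best
    for _ in range(iters):
        r1, r2 = rng.uniform(size=(2, n_particles, dim))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = np.clip(x + v, lo, hi)                  # keep inside bounds
        fx = np.apply_along_axis(f, 1, x)
        improved = fx < pbest_f
        pbest[improved], pbest_f[improved] = x[improved], fx[improved]
        g = pbest[np.argmin(pbest_f)].copy()
    return g, float(pbest_f.min())

# Toy 2-variable "response surface" with its minimum at (200, 230):
f = lambda t: (t[0] - 200.0) ** 2 + (t[1] - 230.0) ** 2
best, fbest = pso(f, lo=np.array([160.0, 210.0]),
                  hi=np.array([220.0, 270.0]))
```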
The optimal set temperatures (T5~T9) obtained through the response surface optimization were then input into the original temperature field simulation model to obtain the corresponding response values, as shown in Table 7.
As shown in the table above, the deviations between the constraints and objectives obtained from the optimization based on the response surface and the simulation results of the original temperature field were all less than 5%. Therefore, the accuracy requirements were met.
5. Robust Optimization Design
Robustness refers to the insensitivity of the dependent variable (outcome or response) to small variations in the factors (causes or inputs). In simple terms, it involves analyzing the probability distribution of a product’s design standard. Robustness primarily focuses on the properties of the probability density curve near its mean, aiming to reduce sensitivity to variations in material properties and loads. This differs from reliability, which considers the tail properties of the probability density curve and requires the standard to be greater than the safety indicator. While these two concepts have differences, they also share certain connections.
5.1. 6σ Design Theory
To improve the yield of batch products, this study adopts the 6σ robust design method, considering the influence of noise factors and control factor fluctuations on product quality. The concept of 6σ originated in the field of quality engineering and refers to raising the probability of producing qualified products to the 6σ level by measuring, analyzing, and controlling the impact of uncertainty factors on product quality [26,27]. The purpose of 6σ design is to ensure that the means of the objective function and the design parameters still meet the constraint boundaries within the ±6σ range, achieving a product qualification rate of up to 99.9999998%. Even if wear of the processing molds shifts the manufacturing accuracy by 1.5σ, the defective rate is only 0.00034% [28].
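The quoted yield figures follow directly from the normal distribution; the sketch below reproduces them for two-sided ±6σ limits, with and without a 1.5σ mean shift.

```python
from statistics import NormalDist

nd = NormalDist()  # standard normal

def yield_fraction(sigma_level, shift=0.0):
    """Two-sided qualification probability for limits at +/-sigma_level
    when the process mean is shifted by `shift` sigma units."""
    return nd.cdf(sigma_level - shift) - nd.cdf(-sigma_level - shift)

centered = yield_fraction(6.0)                    # ~0.999999998
defective = 1.0 - yield_fraction(6.0, shift=1.5)  # ~3.4e-6, i.e. 0.00034%
```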
Applying 6σ-based robust design to the reflow soldering process serves two main purposes: first, to analyze the robustness of batch products under a specific process scheme and establish a robustness evaluation module; second, to embed this robustness evaluation module into the optimization analysis model, constructing a robust optimization analysis module through which the most robust process scheme is obtained automatically. The basic definition is as follows:
$$\begin{aligned}
\min_{\mathbf{x}}\quad & F\left(\mu_f(\mathbf{x}),\,\sigma_f(\mathbf{x})\right)\\
\text{s.t.}\quad & g_L + 6\sigma_g \le \mu_g(\mathbf{x}) \le g_U - 6\sigma_g
\end{aligned}$$

In the equation, $\mathbf{x}$ represents the design variables; $f$ represents the objective function; $g$ represents the constraint conditions; $\mu$ represents the mean of a response; $\sigma^2$ represents the variance of a response; and $g_L$ and $g_U$ are the lower and upper bounds of the constraint conditions.
5.2. Robustness Evaluation Module
The fitted response functions described above were used in place of the original temperature field simulation model, and a robustness analysis module based on the response surface was constructed. The noise factors and the control variables T5~T9 were each assigned a parameter type, a distribution type, and a coefficient of variation (CoV). A total of 100 samples were generated using ordinary Monte Carlo random sampling to initialize the robustness assessment module. By connecting the gradient-based optimization module with the response-surface-based robustness assessment module, the optimized process parameters (Table 6) were subjected to robustness evaluation. First, the cumulative distribution function (CDF) of each response was analyzed to determine whether it followed a normal distribution; Figure 9 shows the distribution obtained for one of the responses.
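The Monte Carlo robustness evaluation can be sketched as follows; the linear stand-in response, the set temperatures, the 1% CoV, and the 245 °C limit are hypothetical placeholders for the fitted response surface and the real process constraints.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 100  # sample size used for the robustness evaluation

def response(T5, T8):
    """Hypothetical surrogate for one response (e.g. peak temperature);
    stands in for the fitted response surface, not the real model."""
    return 0.3 * T5 + 0.7 * T8 + 5.0

# Control variables fluctuating around their set values (CoV = 1%).
T5 = rng.normal(190.0, 0.01 * 190.0, n)
T8 = rng.normal(240.0, 0.01 * 240.0, n)
y = response(T5, T8)

mean, std = y.mean(), y.std(ddof=1)
cov = std / mean                  # coefficient of variation of the response
p_fail = float(np.mean(y > 245.0))  # empirical failure probability
```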
It can be observed that the distribution function of this response is approximately normal, i.e., its probability distribution follows a normal distribution. Similar analysis shows that the other constraint conditions also satisfy normal distributions. The final results of the robustness evaluation are presented in Figure 10.
The specific σ levels and failure probabilities are shown in Table 8.
From Table 8, it can be observed that although the mean values of the process parameters optimized on the response surface are close to the optimal values and the maximum coefficient of variation (CoV) is relatively small, three of the responses fail to achieve the 6σ level. Additionally, two responses show high failure probabilities and low σ levels. Therefore, an optimization design scheme that ignores noise factors and fluctuations in the control variables may, in practical production, result in some products failing to meet the desired process requirements; when the deviations are significant, the final outcome may be product failure.
5.3. Embedded Robustness Optimization Analysis Module
By embedding the robustness evaluation module into the newly established optimization analysis module, data interaction is achieved. The adaptive optimization method is employed to reconfigure the constraint conditions and complete the configuration of the optimization analysis module.
The final result of the most robust solution is shown in Figure 11.
Analyzing the robustness of the optimization results: since the aforementioned responses did not reach the 6σ level, the following focuses on the robustness analysis of those responses. The specific analysis results are shown in Figure 12 and Table 9.
From Table 8 and Table 9, it can be observed that the failure probabilities and σ levels improved significantly through robust optimization. The failure probability for the time above the liquidus line decreased from 0.45 to 1.55 × 10⁻¹⁰, while the σ level increased from 0.12σ to 6σ. Similarly, the failure probability for the peak temperature decreased from 0.52 to 9.06 × 10⁻¹¹, and the σ level increased from 0.06σ to 6σ. The robust optimization results therefore demonstrate a significant improvement in optimization performance, providing a substantial guarantee of the product's yield.
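The relation between a failure probability and its σ level used throughout this section is the inverse normal CDF; for example, a one-sided failure probability of 0.45 corresponds to about 0.12σ, consistent with the values quoted from Table 8.

```python
from statistics import NormalDist

nd = NormalDist()  # standard normal

def sigma_level(p_fail):
    """Sigma level corresponding to a one-sided failure probability."""
    return nd.inv_cdf(1.0 - p_fail)

def failure_probability(level):
    """One-sided failure probability at a given sigma level."""
    return 1.0 - nd.cdf(level)

before = sigma_level(0.45)          # ~0.126 sigma, cf. 0.12 sigma above
after = failure_probability(6.0)    # ~9.9e-10 at the 6 sigma level
```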
5.4. Reliability Analysis and Verification
To verify the reliability of the optimized results and ensure that the influence of intermediate errors on the results is minimal, a reliability analysis was conducted on the results after robust optimization.
Firstly, a reliability analysis model was established to verify the intermediate errors. In this analysis, the reflow soldering process simulation model established in Section 2 was used as the original model. The noise factors and variables were reconfigured, and the extreme conditions and desired levels of the process requirements were set to initialize the reliability analysis. The robust optimization module was then connected to the reliability analysis module, and the data resulting from the robust optimization were transferred to it for further analysis.
The reliability analysis calculations produced a reliability analysis report, which indicated that the embedded robust optimization response had reached a level of 5.7σ, with a failure probability of 8.6 × 10⁻⁹ and a standard deviation error of 5.1 × 10⁻⁹, thereby confirming the effectiveness of the optimization results. A scatter plot of the input variables T5 and T8 versus the peak temperature was obtained, as shown in Figure 13; the red areas in the plot represent the unsafe regions, clearly delineating the unsafe domain.