1. Introduction
In recent years, the optimization of manufacturing processes using machine learning and evolutionary algorithms has become a key requirement for many industries, because these algorithms can save significant time, effort and material by eliminating unnecessary testing. A number of researchers have developed computational models to predict and optimize the outputs of commonly used manufacturing processes. In most of this research, prediction models were developed for the process under consideration from experimental data using techniques such as regression analysis and artificial neural networks. Once prediction models are available, evolutionary algorithms can be used effectively to optimize the input parameters of the process. Welding processes are no exception: such algorithms can predict and optimize the geometrical, microstructural and mechanical properties of the weldment before welding of components commences.
Xiong, et al. [1] applied an artificial neural network (ANN) and a second-order regression analysis to predict the weld bead geometry in robotic gas metal arc welding. They found that the ANN performed better than the second-order regression model because of its greater capacity for approximating non-linear processes. Similar results were obtained by Lakshminarayan and Balasubramanian [2], who compared response surface methodology (RSM) with ANNs to predict the tensile strength of friction-stir welded aluminium joints. Many other researchers [3,4,5,6] have also demonstrated an improvement in the accuracy of predicting weld properties using ANNs. This demonstrates that ANNs are highly effective in predicting the geometrical as well as mechanical features of a weldment. Microstructural features have also been predicted with improved accuracy using ANNs. Vitek, et al. [7] used an ANN to predict the ferrite number of austenitic stainless steel welds from the chemical composition and the cooling rate of the weld pool. This model outperformed all the other internationally accepted methods for predicting the ferrite number at that time.
Evolutionary algorithms have been used effectively by many researchers to optimize the input parameters of various manufacturing processes. These algorithms include genetic algorithms (GA), simulated annealing (SA), particle swarm optimization (PSO), firefly optimization (FO) and ant colony optimization (ACO). In most cases, these algorithms search a given solution space by moving towards a better solution in every iteration.
Hu, et al. [8] applied the PSO algorithm to various engineering problems. They slightly modified the original algorithm so that only feasible solutions are retained in the memory of the algorithm in order to constrain the solution space. They concluded that PSO is an efficient and general approach for solving most nonlinear optimization problems with inequality constraints. Katherasan, et al. [9] also developed optimization models for a flux-cored arc welding process using PSO and ANNs in order to maximize the depth of penetration and minimize the bead width and reinforcement, with good results.
SA was used by Roshan, et al. [10] to optimize a friction stir welding process in order to achieve the desired mechanical properties of AA7075 welds. They found that the SA algorithm is capable of optimizing the welding process parameters to obtain the desired properties. An SA algorithm was also used by Tarng, et al. [11] to optimize the process parameters to obtain a desired bead geometry. They further classified the welds based on bead geometry quality using a fuzzy clustering technique. Similarly, other researchers [12,13] have used SA to optimize welding processes and found it to be very effective in predicting process parameters for the weld bead geometry.
Like SA and PSO, GA has also been used extensively for parameter optimization. Sathiya, et al. [14] applied several optimization algorithms (GA, SA and PSO) to optimize friction-stir welding process parameters to obtain the desired tensile strength and minimize metal loss. They found that GA outperformed the other algorithms they used, and the results obtained from GA were in good agreement with the experimental data. Correia, et al. [15] compared GA and RSM for optimizing the welding process based on four quality responses (deposition efficiency, bead width, depth of penetration and reinforcement). They also found that GA can perform better than RSM; however, optimization using GA requires careful setting of its internal parameters. GA was also used by Pashazadeh, et al. [16] to optimize the electrode tip dressing operation for a resistance spot welding process, and they found that it can be used effectively for optimizing the parameters.
All the algorithms mentioned above are capable of optimizing the welding process, but the methodology they use, the computational effort (function evaluations) and the time required to obtain a feasible solution vary significantly. Although these algorithms can reach the global minimum of an error function, the computational effort and time they require can be substantial. This paper demonstrates a reduction in this effort and time by first finding approximate solutions using any of the above-mentioned algorithms and then applying the Nelder-Mead optimization (NMO) method to refine them further. NMO is most effective as an unconstrained optimization method and hence in many cases cannot be applied directly to the optimization problem, as it can lead to an infeasible solution.
3. Application of Algorithms and Results
3.1. Genetic Algorithm
Genetic algorithms were first developed by John Holland in 1975, inspired by the principles of genetics and natural selection. These algorithms have been successfully applied by researchers to a variety of optimization problems. The general steps involved in the optimization process are shown in Figure 4.
The genetic algorithm starts by initializing a number of chromosomes (random solutions), collectively known as the initial population. The number of chromosomes in the population can have a significant impact on the results obtained as well as on the time and computational effort required to reach the optimized solution. In this case, after a few trial-and-error runs, the number of chromosomes was set to 10, since increasing it beyond this value only increased the computational effort without significantly reducing the final error level. Once the initial population is obtained, the error function is evaluated for every chromosome and the chromosomes are ranked according to their fitness based on the error (chromosomes with the least error are the fittest). Elitism is the process of always carrying the best chromosome over to the next generation. In this case, of the 10 chromosomes required in the next generation, the elite chromosome from the parent population was carried over directly. The next six places were filled using a rank-based selection method, and the last three were filled by randomly generated chromosomes. From the six selected chromosomes, three pairs were formed for the crossover operation with a crossover rate of 0.8. After crossover, some elements of these chromosomes were replaced (mutation) with a mutation rate of 0.3. All these operations, including elitism, crossover and mutation, were repeated until the stopping criteria were met. Two stopping criteria were used to terminate the computation:
The computation is terminated when either of these criteria is met. These termination criteria were used only for the production runs, i.e., the runs whose solutions were fed into the Nelder-Mead optimization algorithm as discussed later in this section. They were removed from the algorithm during the trial runs carried out to understand the increase in the number of function evaluations required when the targeted maximum error is reduced, as illustrated in Figure 5. The algorithm parameters used for the GA optimization are listed in Table 4.
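To make the generation loop described above concrete, the following is a minimal Python sketch, not the implementation used in this work; the helpers ann_error (the error of Equation (1) evaluated through the trained ANN) and bounds (the parameter limits) are assumed names, while the population size (10), elitism (1), rank-based selection (6), random replacement (3), crossover rate (0.8) and mutation rate (0.3) follow the values stated above.

```python
import random

POP_SIZE = 10          # chromosomes per generation (from the text)
CROSSOVER_RATE = 0.8   # probability of crossover for a selected pair
MUTATION_RATE = 0.3    # probability of mutating each gene

def ga_generation(population, ann_error, bounds):
    """One GA generation: elitism (1) + rank-based selection (6) + random (3).

    `population` is a list of parameter vectors, `ann_error` maps a vector to
    its scalar error, and `bounds` is a list of (lo, hi) pairs per parameter.
    """
    ranked = sorted(population, key=ann_error)           # fittest (lowest error) first
    next_gen = [ranked[0]]                               # elitism: keep the best chromosome

    # Rank-based selection: higher-ranked chromosomes get larger weights.
    weights = [POP_SIZE - r for r in range(len(ranked))]
    selected = random.choices(ranked, weights=weights, k=6)

    # Crossover on three pairs, then mutation on the offspring.
    for a, b in zip(selected[0::2], selected[1::2]):
        child1, child2 = a[:], b[:]
        if random.random() < CROSSOVER_RATE:
            cut = random.randrange(1, len(a))
            child1, child2 = a[:cut] + b[cut:], b[:cut] + a[cut:]
        for child in (child1, child2):
            mutated = [random.uniform(lo, hi) if random.random() < MUTATION_RATE else g
                       for g, (lo, hi) in zip(child, bounds)]
            next_gen.append(mutated)

    # Fill the remaining three places with randomly generated chromosomes.
    while len(next_gen) < POP_SIZE:
        next_gen.append([random.uniform(lo, hi) for lo, hi in bounds])
    return next_gen
```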
Although GAs could have been used to further reduce the error mentioned above, the computational effort and time required would have increased significantly, making the implementation impractical. The plot in Figure 5 shows, on a logarithmic scale, the number of ANN evaluations required to obtain a given minimum error. Every data point on the plot is the average of 50 individual observations, because the number of ANN evaluations required by the GA to reach an acceptable solution can vary significantly depending on the chromosomes obtained after crossover and mutation. It can be seen in Figure 5 that the number of ANN evaluations required increases following a power law for errors down to roughly 0.01. Below this error level, the power law index jumps, dramatically increasing the number of function evaluations required. Below a certain targeted error level, the high number of function evaluations required can therefore make the implementation of this algorithm impractical. To overcome this problem, approximate solutions (targeted maximum error = 0.1) that require only a few generations were fed into the NMO method, which can reduce the error significantly with only a few additional ANN evaluations, as shown in later sections.
3.2. Simulated Annealing
Simulated annealing is a probabilistic search algorithm inspired by the process of annealing of metals to obtain desired properties and microstructure. The general steps followed for optimization of parameters using the SA algorithm are shown in Figure 6.
Simulated annealing considers only one solution at a time, unlike the genetic algorithm, which considers a number of chromosomes simultaneously. The fitness of this solution is compared to the fitness of a neighbouring solution. If the fitness of the neighbouring solution is better than that of the current solution, the neighbouring solution is always accepted. However, if the fitness of the neighbouring solution is worse than that of the current solution, the neighbour may still be accepted with a certain probability. This probability of accepting the worse solution depends on the difference in error between the two solutions and on the current temperature. The annealing schedule, i.e., the rate at which the "temperature" (the step size) is reduced in the SA algorithm, plays a critical role in reaching a global minimum, since it has a direct impact on the probability of accepting a worse solution. A logarithmic cooling scheme, introduced by Geman and Geman, uses the cooling schedule shown in Equation (2) [18].
where T(t) is the temperature at time t, and c and d are constants.
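Equation (2) itself is not reproduced in this text; for reference, the Geman and Geman logarithmic schedule is commonly written in the form below, which is consistent with the constants c and d defined above (a reconstruction offered for orientation rather than a copy of the paper's equation):

```latex
% Logarithmic (Geman and Geman) cooling schedule, cf. Equation (2)
\[ T(t) = \frac{c}{\log\left(t + d\right)} \]
```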
It has been proven that if c is greater than or equal to the largest energy barrier of the system, this cooling schedule can lead to the global minimum. However, it may require an extremely large number of function evaluations, making its application impractical. Other cooling schedules commonly used to overcome this problem are the linear schedule and the exponential schedule, shown in Equations (3) and (4), respectively.
where T is the temperature at time t, T0 is the temperature at time (t−1), and η and α are constants.
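Equations (3) and (4) are likewise not reproduced here; written per step, consistent with T0 being the temperature at the previous instant, the linear and exponential schedules typically take the forms below (again a hedged reconstruction rather than the paper's exact notation):

```latex
% Linear cooling schedule, cf. Equation (3)
\[ T = T_{0} - \eta \]
% Exponential cooling schedule, cf. Equation (4)
\[ T = \alpha\, T_{0}, \qquad 0 < \alpha < 1 \]
```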
For the SA algorithm in this case, the exponential cooling schedule was applied, since it gave good results within an acceptable computation time. The initial temperature used for the algorithm was 1 and was reduced using the cooling schedule to 0.01, after which the computation was terminated. This range of temperatures varied the probability of selecting a worse solution roughly between 0.99 at high temperatures and 0.01 at low temperatures (depending on the error in the output). Another termination criterion used was that the error evaluated using the error function in Equation (1) falls below 0.1. When either of the two stopping criteria was met, the best solution obtained up to that point was accepted as the solution from the SA algorithm. These stopping criteria were used only for the production runs and were removed from the algorithm for the trial runs. The stopping criterion on error could be reduced for the production runs, but the computational effort and time increase significantly with reduction in the targeted maximum error, as shown on a logarithmic scale in Figure 7. In this case also, every data point on the plot is the average of 50 observations. The SA parameters used for the optimization in this case are listed in Table 5.
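For orientation, a minimal sketch of the SA loop as described is given below; the temperature range (1 down to 0.01) and the 0.1 error stopping criterion come from the text, while the decay constant alpha, the neighbour-generation strategy and the helper names ann_error and random_neighbour are illustrative assumptions.

```python
import math
import random

def simulated_annealing(x0, ann_error, random_neighbour,
                        t_start=1.0, t_end=0.01, alpha=0.95, target_error=0.1):
    """Exponential-cooling SA as described in the text; values other than the
    temperature range and target error are illustrative assumptions."""
    current, current_err = x0, ann_error(x0)
    best, best_err = current, current_err
    t = t_start
    while t > t_end and best_err > target_error:
        candidate = random_neighbour(current, t)      # step size tied to temperature
        cand_err = ann_error(candidate)
        # Always accept better solutions; accept worse ones with a probability
        # that depends on the error difference and the current temperature.
        if cand_err < current_err or random.random() < math.exp((current_err - cand_err) / t):
            current, current_err = candidate, cand_err
            if current_err < best_err:
                best, best_err = current, current_err
        t *= alpha                                    # exponential cooling schedule
    return best, best_err
```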
Similar to the GA, reducing the maximum permissible error increases the number of ANN evaluations following a power law for errors down to 0.02, below which the power law index jumps. Consequently, further error reduction requires much greater computational expenditure, as seen from the steeper line on the top left of Figure 7. In order to avoid this high computational effort, the maximum permissible error for the production runs was limited to 0.1, as mentioned earlier, and the solutions obtained from SA were fed into the Nelder-Mead optimization algorithm to further reduce the error to 0.001 within only a few additional ANN evaluations.
3.3. Particle Swarm Optimization
Particle swarm optimization has characteristics similar to genetic algorithms in that both start by assuming a number of solutions within the solution space. However, the methodologies these algorithms use to move towards the optimum solution differ significantly. PSO works primarily with two quantities for each particle: its position (the candidate solution) and its velocity. During every iteration, a particle is accelerated towards that particle's best and the global best solution obtained up to that point. This is done by first updating the particle velocity using Equation (5).
where Vij(t+1) is the velocity of particle i at dimension j at time t+1, Vij(t) is the velocity of particle i at dimension j at time t, α and β are constants having a value of 2, r is a random number, gbest is the global best solution, pbest is the particle best solution, and pij is the position of particle i at dimension j at time t.
After calculating the updated velocities, the particle positions are updated using Equation (6).
where pij is the position of particle i at dimension j and vij is the velocity of particle i at dimension j.
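A minimal sketch of these two update rules is given below; the per-term random numbers and the component-wise form follow the standard PSO formulation and are assumptions where the text does not spell them out, while the values α = β = 2 come from the definitions above.

```python
import random

ALPHA = 2.0   # cognitive constant (α in Equation (5))
BETA = 2.0    # social constant (β in Equation (5))

def pso_update(position, velocity, pbest, gbest):
    """One PSO update for a single particle (cf. Equations (5) and (6)).

    All arguments are lists of the same length (one entry per dimension j).
    """
    new_velocity = []
    for j in range(len(position)):
        r1, r2 = random.random(), random.random()
        v = (velocity[j]
             + ALPHA * r1 * (pbest[j] - position[j])    # pull towards particle best
             + BETA * r2 * (gbest[j] - position[j]))    # pull towards global best
        new_velocity.append(v)
    new_position = [p + v for p, v in zip(position, new_velocity)]
    return new_position, new_velocity
```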
The flowchart of the steps followed in PSO is shown in Figure 8. Like the other algorithms mentioned above, this algorithm also used two termination criteria for the production runs. The first criterion was based on the maximum number of iterations, which was limited to 1000 in this case, and the other was based on the error function, whose threshold was set to 0.1. When either of the two criteria is met, the algorithm terminates and the global best solution obtained up to that point is taken as the solution from PSO. The solutions from this algorithm can also be fed into the NMO method to reduce the error further, as explained in detail later. Both criteria were removed from the algorithm for the trial runs in order to understand the increase in computational effort required by PSO when the targeted maximum error is lowered.
Figure 9 shows, on a logarithmic scale, the number of function evaluations required by PSO to obtain a given minimum error in the output. As seen from the figure, it becomes impractical to reduce the error below a certain point, as the computational effort and time increase significantly for no perceivable benefit. As with the previous two algorithms, each data point in Figure 9 is the average of 50 runs.
Comparing the three algorithms discussed so far, at relatively high targeted error levels GA required the highest number of ANN evaluations, whereas PSO required the fewest. However, when the targeted maximum error is reduced, GA starts performing better than SA and PSO. In fact, in this case, a targeted maximum error of 0.001 could only be obtained using the GA, although it required close to one million ANN evaluations; this error level could not be reached using SA or PSO. This indicates that GA outperforms SA and PSO, in agreement with the results obtained by Sathiya, et al. [14]. However, if the function to be optimized is costly to evaluate, one million evaluations to get near the global minimum can be unacceptable.
3.4. Nelder-Mead Optimization Method
The Nelder-Mead optimization (simplex) method was first developed by John Nelder and Roger Mead in 1965. A simplex consists of n + 1 vertices in an n-dimensional space, each of which represents a potential solution to the optimization problem. In every iteration, the worst solution is replaced by a better solution obtained through operations on the vertices. The steps followed in the application of NMO in this case are as follows:
1. Initialize the simplex using (n + 1) potential solutions, where n is the number of parameters to be optimized.
2. Define the error function.
3. For each simplex vertex, calculate the error using the error function. Let the best solution be vertex B, the worst solution vertex W and the next-worst solution vertex N, with errors EB, EW and EN, respectively.
4. If EB is less than the desired error, terminate; otherwise, continue with the following steps.
5. Find the centroid, C, of the best n vertices of the simplex.
6. Reflect point W through the centroid and calculate the error ER at the reflected point R.
7. The further steps depend on the error obtained at the reflected point, as follows:
Case 1, ER < EW:
If ER < EB, extend point R to point E by a distance equal to that between C and R, and calculate EE. If EE < EB, replace W by E; otherwise, replace W by R. This extension process can produce skinny simplices, which can restrict the ability of NMO to find good search directions (see below for the way the simplex can be re-fattened). Repeat the process from step 4.
Case 2, ER ≥ EW:
Calculate EC, the error value at the centroid. If EC < EB, construct a fat simplex about half the size about the centroid, and repeat the process from step 3.
If EC < EN, find the midpoint, M, between C and R and calculate EM. If EM < EC, replace W by M; otherwise, replace W by C. Repeat the process from step 4.
If EC > EN, construct a fat simplex about half the size about B, and repeat the process from step 3.
The NMO algorithm is summarized in the flowchart shown in Figure 10. To simplify the understanding of the algorithm, a schematic with two variables (three vertices) is shown in Figure 11.
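For illustration, a minimal Python sketch of one iteration of the procedure listed above is given below. It follows the reflection, extension and contraction logic as described; the "re-fattening" steps are approximated by halving the simplex about the centroid or the best vertex, and the helper names and vertex representation are assumptions rather than the authors' code.

```python
def nmo_iteration(simplex, error_fn):
    """One iteration of the simplex procedure described above.

    `simplex` is a list of n+1 parameter vectors (lists); `error_fn` maps a
    vector to its scalar error.
    """
    simplex = sorted(simplex, key=error_fn)                # step 3: rank vertices
    best, worst, next_worst = simplex[0], simplex[-1], simplex[-2]
    e_best, e_worst, e_next = map(error_fn, (best, worst, next_worst))

    n = len(best)                                          # number of parameters
    others = simplex[:-1]                                  # the best n vertices
    centroid = [sum(v[j] for v in others) / len(others) for j in range(n)]   # step 5
    reflected = [2 * c - w for c, w in zip(centroid, worst)]                 # step 6
    e_ref = error_fn(reflected)

    if e_ref < e_worst:                                    # Case 1
        if e_ref < e_best:
            # Extend R to E by the same distance again beyond R.
            extended = [2 * r - c for r, c in zip(reflected, centroid)]
            simplex[-1] = extended if error_fn(extended) < e_best else reflected
        else:
            simplex[-1] = reflected
    else:                                                  # Case 2
        e_cent = error_fn(centroid)
        if e_cent < e_best:
            # Re-fatten: new simplex about half the size about the centroid.
            simplex = [centroid] + [[(v[j] + centroid[j]) / 2 for j in range(n)]
                                    for v in others]
        elif e_cent < e_next:
            midpoint = [(c + r) / 2 for c, r in zip(centroid, reflected)]
            simplex[-1] = midpoint if error_fn(midpoint) < e_cent else centroid
        else:
            # Re-fatten: new simplex about half the size about the best vertex.
            simplex = [best] + [[(v[j] + best[j]) / 2 for j in range(n)]
                                for v in simplex[1:]]
    return simplex
```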
NMO can be implemented to include constraints, but the process of taking the constraints into account usually makes the algorithm inefficient. Thus, NMO is almost invariably implemented as an unconstrained optimization algorithm, meaning that there is no control over the search space of the solution. If applied directly to optimize welding parameters, using an initial simplex that spans the feasible regime, the search would normally stray into the infeasible region.
Consequently, the three algorithms mentioned above (GA, SA and PSO) are first applied to find approximate solutions with a targeted maximum error of 0.1, after which the Nelder-Mead algorithm is used to refine the solution further. In almost all cases this guarantees that the solution obtained lies inside the solution space. Finding an approximate solution first also reduces the number of NMO iterations required to obtain significantly lower errors.
The vertices (solutions) required to start the algorithm can be obtained through any of the above-mentioned algorithms.
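As a sketch of this overall hybrid scheme, the driver below assumes a coarse_optimizer routine (any of the GA, SA or PSO procedures above, returning one approximate solution with error below 0.1) and reuses the nmo_iteration sketch from the previous subsection; all names, signatures and the iteration cap are illustrative assumptions, not the authors' implementation.

```python
def hybrid_optimize(coarse_optimizer, ann_error, n_params,
                    coarse_error=0.1, target_error=0.001, max_iters=500):
    """Hybrid scheme described in the text: a coarse, constraint-respecting
    optimizer (GA, SA or PSO) supplies the initial simplex, and NMO refines it."""
    # Build a simplex of n+1 vertices from independent coarse runs.
    simplex = [coarse_optimizer(ann_error, coarse_error) for _ in range(n_params + 1)]

    # Refine with Nelder-Mead until the best vertex falls below the target error.
    for _ in range(max_iters):
        simplex = sorted(simplex, key=ann_error)
        if ann_error(simplex[0]) < target_error:
            break
        simplex = nmo_iteration(simplex, ann_error)
    return min(simplex, key=ann_error)
```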
Figure 12 shows the reduction in error on application of Nelder-Mead optimization to vertices obtained from the GA. Since the stopping-criterion error used for this GA optimization was relatively high (~0.1), only a few GA iterations were required to generate these initial vertices, which led to a very small computational effort and time. The number of ANN evaluations required by the GA to form a simplex of 7 vertices is shown for 6 different trials in Table 6. On average, 119 ANN evaluations were required to obtain each vertex. For the simplex method, the stopping criterion was that the error at the best vertex falls below 0.001, which on average required another 51 evaluations over and above those required by the GA. The average total number of ANN evaluations required by the combined GA and NMO algorithm was 884, as seen in Table 6. If only the GA were used to obtain this level of error, the number of evaluations required would be close to one million, as mentioned previously. Thus, application of Nelder-Mead optimization in combination with GA can significantly reduce the computational effort and time.
For every trial in Figure 12, each vertex was obtained by running the GA once to obtain an error below 0.1 and taking the elite chromosome (solution) as one of the vertices. Consequently, for every trial, the GA was run 7 times.
It can similarly be shown that the number of ANN evaluations required for optimization can be significantly reduced by applying the NMO method to the approximate solutions obtained from SA. Figure 13 shows the reduction in the error in outputs when the same NMO algorithm was applied to the approximate solutions obtained from SA. In this case as well, each vertex for every trial was obtained by running the SA algorithm once. On average, 151 ANN evaluations were required by SA to reduce the error to 0.1 and form each vertex of the simplex. NMO then required another 48 ANN evaluations to reduce the error below 0.001, as shown in Table 7. As mentioned earlier, this low level of error could not be obtained if only SA were used with the existing parameters, such as the cooling schedule and decay rate. The average total number of ANN evaluations required by the combined SA and NMO to obtain an error below 0.001 was 1105, again demonstrating the efficiency of NMO.
Nelder-Mead optimization is equally effective on the approximate solutions obtained from PSO, as shown in Figure 14. The average total number of ANN evaluations required by the combined PSO and NMO algorithm to reduce the error below 0.001 was 317, as shown in Table 8. This value is lower than those required by GA + NMO and SA + NMO, mainly because PSO requires fewer ANN evaluations than GA and SA to develop the initial simplex. As previously mentioned, this low level of error could not be obtained using PSO alone.
Figure 15 shows the number of ANN evaluations required to obtain given levels of maximum error using the different algorithms for a randomly chosen set of desired outputs of the welding process. The significant drop in the number of evaluations required when NMO is used along with any of the other algorithms makes such a combined system highly practical and easily applicable for solving optimization problems.
4. Conclusions
From the experimental data obtained and the computational models developed, the following can be concluded:
A number of evolutionary algorithms can be used for optimization of weld bead geometry of a TIG welding process using a filler material. However, the computation effort and time required by these algorithms to achieve the desired error can make the use of these algorithms impractical.
When GA, SA and PSO are compared at a sufficiently high targeted maximum error, PSO requires the fewest function evaluations to find a solution. However, when the targeted error is reduced, GA proves to be more efficient, requiring fewer ANN evaluations than the other two algorithms.
In all the algorithms mentioned above, as the targeted error is reduced, the number of function evaluations required initially increases following a power law until a certain error level is reached, after which the power law index jumps and the number of function evaluations increases dramatically.
Once approximate solutions that do not violate the constraints have been obtained cheaply from any of the above-mentioned algorithms, the Nelder-Mead (simplex) optimization method can be applied to these solutions to reduce the error significantly further within very few additional evaluations.
Thus, this hybrid optimization method shows a high level of robustness combined with great efficiency: it uses algorithms that can easily take account of constraints and that, by their nature, are effective at cheaply finding the general region of parameter space in which the global optimum resides, and it then uses the remarkably efficient Nelder-Mead algorithm to home in on the precise global optimum without violating the physical parameter constraints.