In this section, numerical experiments are conducted to evaluate the overall performance of the proposed IICNSGA-III algorithm. All algorithms are implemented in Python 3.12.1 and executed on a computer with an Intel Core i7-10510U processor (2.30 GHz) (Lenovo, Beijing, China), running Windows 11.
4.2. Performance Comparison Experiment
The performance of the proposed IICNSGA-III algorithm is evaluated by comparing it with NSGA-III and NSGA-II algorithms. The algorithm parameters are set as follows: population size = 120, maximum number of iterations = 500, crossover probability = 0.9, and mutation probability = 0.1. To ensure a fair comparison, all three algorithms are tested on nine problem instances, and the objective function values obtained are analyzed. Each algorithm is executed 10 times for each problem instance to mitigate the impact of randomness.
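For concreteness, these settings can be summarized in a small configuration sketch like the one below; the key names, the loop structure, and the full list of instance names are illustrative assumptions rather than the authors' actual code.

```python
# A minimal sketch of the experimental configuration reported above (assumed structure).
CONFIG = {
    "population_size": 120,
    "max_iterations": 500,
    "crossover_probability": 0.9,
    "mutation_probability": 0.1,
    "runs_per_instance": 10,   # repeated runs to mitigate randomness
    "algorithms": ["IICNSGA-III", "NSGA-III", "NSGA-II"],
}

# Nine "products-suppliers" instances (naming assumed; only some of the nine names,
# e.g. 10-5, 10-10, 10-15, 20-15, 30-10, 30-15, appear explicitly in the text).
INSTANCES = ["10-5", "10-10", "10-15", "20-5", "20-10", "20-15", "30-5", "30-10", "30-15"]

for instance in INSTANCES:
    for algorithm in CONFIG["algorithms"]:
        for run in range(CONFIG["runs_per_instance"]):
            pass  # run `algorithm` on `instance` and record the four objective values
```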
Box plots are used to illustrate the objective function values obtained by IICNSGA-III, NSGA-III, and NSGA-II for both small-scale and large-scale instances, providing a performance evaluation of the IICNSGA-III algorithm.
Figure 2 and Figure 3 present the results for instances 10-15 and 30-15, respectively, comparing the performance of the three NSGA algorithms in solving the multi-objective procurement optimization model. Each subplot contains three box plots, one for the solution set obtained by each of the three NSGA algorithms. Each box plot includes four black horizontal lines marking the maximum value, upper quartile, lower quartile, and minimum value of the solution set, and one pink horizontal line marking the median. Additionally, each subplot features three long dashed lines representing the mean values of the solution sets obtained by the three NSGA algorithms.
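Box plots of this style can be reproduced with matplotlib along the lines of the following sketch; the data values are placeholders, and the styling (whisker placement, pink median, dashed mean lines) is only an approximation of the original figures.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
# Hypothetical objective values (e.g., procurement cost) from 10 runs of each algorithm.
data = {
    "IICNSGA-III": rng.normal(100, 2, 10),
    "NSGA-III": rng.normal(106, 3, 10),
    "NSGA-II": rng.normal(107, 3, 10),
}

fig, ax = plt.subplots()
ax.boxplot(
    list(data.values()),
    labels=list(data.keys()),
    whis=(0, 100),                      # whiskers at the minimum and maximum values
    medianprops={"color": "pink"},      # pink median line
    showmeans=True, meanline=True,      # dashed line for the mean of each solution set
    meanprops={"linestyle": "--"},
)
ax.set_ylabel("Objective value")
plt.show()
```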
Figure 2 and Figure 3 show that for both small-scale and large-scale instances, IICNSGA-III yields lower mean and minimum values for all four objective functions compared to NSGA-III and NSGA-II. Moreover, as the problem size increases, IICNSGA-III consistently maintains a significant advantage, demonstrating its superior solution quality over NSGA-III and NSGA-II. Specifically, as the number of products and suppliers increases, IICNSGA-III consistently finds lower objective function minima while keeping the upper bound (i.e., maximum value) from increasing significantly. Additionally, the median and mean values of all four objective functions in the solution sets are lower in IICNSGA-III than in NSGA-III and NSGA-II, demonstrating the superior overall performance of the proposed algorithm.
In contrast to the other two algorithms, IICNSGA-III exhibits a slightly more dispersed solution distribution. This is closely linked to the nature of the multi-objective optimization model, which simultaneously considers economic, quality, risk, and environmental objectives. In such cases, the algorithm must navigate trade-offs among competing objectives, where optimizing one may come at the expense of another.
For instance, by minimizing procurement costs, the algorithm may favor lower-cost suppliers, even if they have longer delivery times or lower quality standards. This trade-off can increase total loss and defect rates, reducing robustness in these aspects. Nevertheless, despite these compromises, IICNSGA-III consistently achieves lower mean and minimum values across all objective functions. This demonstrates its ability to effectively balance competing goals, avoiding excessive optimization in a single area while ensuring a well-rounded solution that integrates economic efficiency, quality, risk mitigation, and environmental considerations.
To compare the optimization results of the different algorithms, the Relative Percentage Deviation (RPD) is used as a metric to assess solution quality among the three algorithms. The RPD is calculated using the following formula:
RPD = (D(A) − Dbest) / Dbest × 100%,
where D(A) represents the objective function value obtained by algorithm A for problem instance D, and Dbest denotes the best objective function value among the results obtained by the three algorithms for the same problem instance D. Since a lower objective function value corresponds to a more optimal procurement plan, RPD is always greater than or equal to zero. A smaller RPD indicates that the solution is closer to the optimal result, and RPD = 0 signifies that the solution is the best among all computed results. For the nine problem instances, the proposed algorithm is compared with the other algorithms based on the relative percentage deviation of the minimum value (MIN), the maximum value (MAX), and the average of the best values (AVG) obtained from 10 independent runs within five sets of trials for each problem instance. The corresponding results are presented in Table 3.
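For illustration, the MIN, MAX, and AVG RPD statistics can be computed from raw objective values as in the following sketch; the sample values are hypothetical and not taken from Table 3.

```python
import numpy as np

def rpd(values, best):
    """Relative Percentage Deviation of objective values against the best known value."""
    return (np.asarray(values) - best) / best * 100.0

# Hypothetical objective values from 10 runs of each algorithm on one problem instance.
results = {
    "IICNSGA-III": np.array([101.2, 100.8, 102.0, 101.5, 100.9, 101.1, 102.3, 101.0, 100.8, 101.7]),
    "NSGA-III":    np.array([105.4, 106.1, 104.9, 107.2, 105.8, 106.5, 105.1, 106.9, 105.3, 106.0]),
    "NSGA-II":     np.array([106.2, 105.7, 107.4, 106.8, 105.9, 107.1, 106.4, 105.8, 107.0, 106.3]),
}

# Dbest: best (lowest) objective value found by any of the three algorithms on this instance.
d_best = min(v.min() for v in results.values())

for name, vals in results.items():
    devs = rpd(vals, d_best)
    print(f"{name}: MIN={devs.min():.2f}%  MAX={devs.max():.2f}%  AVG={devs.mean():.2f}%")
```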
From Table 3, it can be observed that in the vast majority of problem instances, the minimum (optimal) and mean Relative Percentage Deviation (RPD) values obtained by the IICNSGA-III algorithm are zero, indicating that the solution quality of IICNSGA-III is superior to that of NSGA-III and NSGA-II.
Specifically, the RPD values of both the best-found solutions and the mean solutions obtained by IICNSGA-III are significantly lower than those of NSGA-III and NSGA-II across all test cases. This improvement is primarily attributed to structural enhancements in IICNSGA-III, particularly the incorporation of repair mechanisms in population initialization, crossover, and mutation operations. By ensuring that the generated solutions consistently satisfy integer constraints and other problem-specific conditions, IICNSGA-III effectively enhances solution feasibility and robustness. This is particularly crucial for the order allocation problem, which involves a large solution space with multiple constraints.
In addition to structural improvements, IICNSGA-III integrates the NSGA-III framework with Pareto Simulated Annealing (PSA), using simulated annealing for local searches based on existing solutions. This approach reduces the risk of local optima, enhances solution diversity, and expands the search space. PSA further broadens the Pareto front by improving exploration, whereas NSGA-III and NSGA-II, lacking a similar global search mechanism, are more prone to premature convergence and stagnation in large-scale problems. As a result, IICNSGA-III achieves higher solution accuracy, better convergence stability, and greater adaptability across different problem scales, delivering superior optimization performance.
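As a rough illustration of the idea behind the PSA component, the sketch below shows a generic dominance-based simulated-annealing acceptance rule; the neighborhood move, weighting, and cooling schedule actually used in IICNSGA-III are not specified here, and the details shown are assumptions.

```python
import math
import random

def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all objectives minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def psa_accept(current, candidate, weights, temperature):
    """Generic PSA-style acceptance rule (assumed form).

    A candidate that dominates the current solution is always accepted; otherwise it
    is accepted with a Boltzmann probability driven by the weighted degradation of
    the objectives, which lets the search escape local optima at high temperatures.
    """
    if dominates(candidate, current):
        return True
    # Weighted aggregate degradation (positive means the candidate is worse overall).
    delta = sum(w * (c - cur) for w, c, cur in zip(weights, candidate, current))
    if delta <= 0:
        return True
    return random.random() < math.exp(-delta / max(temperature, 1e-12))

# Hypothetical usage with four minimized objectives (cost, quality loss, risk, emissions).
current = (120.0, 0.05, 0.30, 85.0)
candidate = (118.0, 0.06, 0.31, 84.0)
print(psa_accept(current, candidate, weights=(0.25, 0.25, 0.25, 0.25), temperature=1.0))
```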
The maximum RPD values across all test cases further demonstrate the robustness of IICNSGA-III. In most cases, its maximum RPD remains at zero or within a minimal range, reflecting stable and high-quality performance. Although some instances show relatively higher values, this does not affect its overall advantage, as it still achieves the best minimum and average RPD. These larger values also suggest a wider solution spread, indicating stronger global search ability and a more diverse Pareto front.
Meanwhile, the Mann–Whitney U non-parametric test was employed to assess the differences between the results obtained by the proposed algorithm and each compared algorithm across the nine test cases, with four objective function values per case, totaling 36 results. The test results are presented in Table 4. The significance level was set at α = 0.05, with the null hypothesis stating that there is no difference between the two groups. If the p-value is greater than or equal to 0.05, the null hypothesis is retained, indicating no statistically significant difference between the two datasets. Conversely, if the p-value is below 0.05, a statistically significant difference is confirmed.
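A minimal sketch of this test using SciPy is shown below; the sample arrays are placeholders, and the two-sided alternative is an assumption consistent with the null hypothesis stated above.

```python
from scipy.stats import mannwhitneyu

# Hypothetical objective values from 10 runs of two algorithms on one instance/objective.
iicnsga_iii = [101.2, 100.8, 102.0, 101.5, 100.9, 101.1, 102.3, 101.0, 100.8, 101.7]
nsga_iii    = [105.4, 106.1, 104.9, 107.2, 105.8, 106.5, 105.1, 106.9, 105.3, 106.0]

stat, p_value = mannwhitneyu(iicnsga_iii, nsga_iii, alternative="two-sided")
alpha = 0.05
if p_value < alpha:
    print(f"p = {p_value:.4f} < {alpha}: statistically significant difference")
else:
    print(f"p = {p_value:.4f} >= {alpha}: no statistically significant difference")
```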
From Table 4, it can be observed that IICNSGA-III exhibits significant differences from each compared algorithm in the vast majority of test cases. Combined with the data in Table 3, it is evident that the proposed algorithm demonstrates a clear advantage over all compared algorithms.
To further analyze the performance of the three algorithms across different scales, the Hypervolume (HV) indicator and the Inverted Generational Distance (IGD) are employed. The HV value measures the volume of the region in the objective space enclosed by the non-dominated solution set obtained by the algorithm and the reference point, while the IGD value measures the discrepancy between the obtained solution set and the true Pareto front. Thus, a higher HV value and a lower IGD value indicate better overall algorithmic performance. The reference point for the HV calculation is set to a point slightly larger than all objective values, while the Pareto reference front for the IGD calculation consists of the optimal solutions for each objective function. The HV and IGD values for IICNSGA-III, NSGA-III, and NSGA-II across the nine test cases are shown in Figure 4 and Figure 5. As illustrated there, IICNSGA-III exhibits significantly superior performance over both NSGA-III and NSGA-II across the nine test cases of varying scales.
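As a sketch of how these indicators can be computed, the pymoo library provides HV and IGD implementations; the objective matrix, reference point, and reference front below are illustrative placeholders rather than the exact sets used in the experiments.

```python
import numpy as np
from pymoo.indicators.hv import HV
from pymoo.indicators.igd import IGD

# Hypothetical non-dominated set: rows are solutions, columns are the four minimized objectives.
F = np.array([
    [100.0, 0.05, 0.30, 85.0],
    [103.0, 0.04, 0.32, 82.0],
    [ 98.0, 0.06, 0.28, 88.0],
])

# Reference point set slightly larger than all objective values, as described above.
ref_point = F.max(axis=0) * 1.1

# Illustrative reference front; in the experiments it is built from the best
# solutions found for each individual objective.
ref_front = np.array([
    [ 98.0, 0.06, 0.30, 86.0],   # best cost
    [103.0, 0.04, 0.32, 82.0],   # best quality
    [ 99.0, 0.05, 0.28, 87.0],   # best risk
    [101.0, 0.05, 0.31, 81.0],   # best emissions
])

hv = HV(ref_point=ref_point)
igd = IGD(ref_front)
print("HV :", hv(F))
print("IGD:", igd(F))
```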
All HV values are expressed in units of . As shown in Figure 4, IICNSGA-III consistently achieves the highest HV values across all problem instances, with its advantage becoming more pronounced as the problem scale increases. In small-scale problems such as 10-5, IICNSGA-III reaches 0.03, three times the value of NSGA-III at 0.01. In 10-10, IICNSGA-III increases to 0.09, while NSGA-III and NSGA-II achieve only 0.03 and 0.02, respectively. The performance gap remains moderate at this scale, but IICNSGA-III maintains a consistent lead.
As the problem size increases, IICNSGA-III demonstrates a stronger advantage. In 20-15, its HV value rises to 1.09, significantly outperforming NSGA-III at 0.23 and NSGA-II at 0.25. The trend becomes even more pronounced in large-scale problems. In 30-10, IICNSGA-III reaches 2.78, more than six times the 0.42 recorded by NSGA-III and well above the 0.70 achieved by NSGA-II. Similarly, in 30-15, IICNSGA-III achieves 1.83, far ahead of NSGA-III at 0.47 and NSGA-II at 0.51.
All IGD values are expressed in units of . As shown in Figure 5, IICNSGA-III consistently achieves the lowest IGD values across all problem instances, indicating superior convergence towards the true Pareto front. In small-scale problems, IICNSGA-III achieves 1.80 in 10-5, lower than NSGA-III at 1.94 and NSGA-II at 1.95. In 10-10, IICNSGA-III maintains a lower IGD of 2.06, while NSGA-III and NSGA-II record 2.50 and 2.35, respectively.
As the problem scale increases, the performance gap widens. In 20-15, IICNSGA-III achieves 3.58, significantly lower than NSGA-III at 4.15 and NSGA-II at 4.20. The trend continues in 30-10, where IICNSGA-III reaches 3.65, outperforming NSGA-III at 4.98 and NSGA-II at 4.93. For large-scale problems, IICNSGA-III demonstrates the most substantial improvement. In 30-15, it maintains the lowest IGD value at 3.97, while NSGA-III and NSGA-II record 4.94 and 5.09, respectively.
The HV and IGD metrics jointly characterize multi-objective optimization performance. A high HV indicates a broad, well-spread set of trade-off solutions, helping decision-makers balance conflicting objectives such as cost reduction and carbon emission control; in high-dimensional optimization with four or more objectives, traditional algorithms often fail to explore the objective space effectively. A low IGD indicates a small discrepancy between the obtained solution set and the true Pareto front, i.e., better convergence and a more reliable solution set, and reflects a search that avoids premature convergence. IICNSGA-III improves both HV and IGD, offering a well-distributed solution set with superior convergence. Through its enhanced global search strategies, it generates solutions closer to the global optimum, supporting more precise decision-making under complex constraints.
NSGA-III and NSGA-II perform poorly in large-scale problems, especially as the number of suppliers increases: their HV values increase only slightly while their IGD values rise rapidly, indicating a tendency toward early convergence and difficulty in finding the global optimum. IICNSGA-III shows clear advantages in accuracy and solution quality, particularly on complex instances, and offers decision-makers more trade-off options among multiple objectives. The algorithm also performs well in high-dimensional optimization and constraint handling, generating a broader and higher-quality Pareto front. IICNSGA-III is therefore well suited to multi-objective, multi-constraint problems with four or more objectives, particularly where high-quality trade-off solutions are needed.
4.3. Ablation Experiment
This study introduces four mechanisms: heuristic population initialization, infeasible solution repair, improved crossover and mutation strategies, and Pareto simulated annealing. To examine their impact on the overall algorithm’s performance, each mechanism was replaced with a traditional counterpart, resulting in four alternative algorithms. Specifically, the algorithm obtained by replacing heuristic population initialization with random population generation is referred to as non-Heuristic Population Initialization (non-HPI). The algorithm derived by modifying the infeasible solution optimization and repair mechanism to perform only basic repairs without additional optimization steps is denoted as Infeasible Solution Repair (ISR). The algorithm that replaces the improved crossover strategy with simulated binary crossover and the improved mutation strategy with polynomial mutation is termed Simulated Binary Crossover-Polynomial Mutation (SBX-PM). Finally, the algorithm obtained by removing Pareto simulated annealing is referred to as non-Pareto Simulated Annealing (non-PSA).
These four algorithms were executed 10 times on each of the nine test cases, using procurement cost as the evaluation benchmark. The relative percentage change (RPC) in the optimal solution values was analyzed to compare the performance of these algorithms with traditional mechanisms against the proposed algorithm. The specific calculation formula is presented in Equation (13).
Here, D(X) represents the result obtained for problem instance D when the improved mechanism is replaced with the traditional mechanism, and it is compared against the result obtained for problem instance D when running the proposed algorithm. A larger RPC value indicates that the corresponding mechanism has a more significant impact on the algorithm’s performance, contributing more to the optimization of procurement costs.
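Assuming RPC takes the same relative-deviation form as the RPD metric above, i.e., RPC = (D(X) − D(IICNSGA-III)) / D(IICNSGA-III) × 100%, a minimal sketch of the ablation comparison is:

```python
def rpc(ablated_cost, proposed_cost):
    """Relative percentage change in procurement cost when an improved
    mechanism is replaced by its traditional counterpart (assumed form)."""
    return (ablated_cost - proposed_cost) / proposed_cost * 100.0

# Hypothetical best procurement costs on one instance: proposed algorithm vs. ablated variants.
proposed = 1000.0
variants = {"non-HPI": 1021.0, "ISR": 1025.0, "SBX-PM": 1058.0, "non-PSA": 1014.0}

for name, cost in variants.items():
    print(f"{name}: RPC = {rpc(cost, proposed):.2f}%")
```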
From the results in Figure 6, IICNSGA-III consistently achieves superior solutions across all problem instances compared to non-HPI, ISR, SBX-PM, and non-PSA. The improved crossover and mutation strategy performs best among the four mechanisms, achieving the highest RPC values across all test cases, with 5.98% in 10-15, 5.57% in 20-15, and 2.95% in 30-10. This indicates that the weight-matrix-based crossover operator and multi-column exchange mutation operator enhance the algorithm’s global search capability, effectively preventing premature convergence to local optima. The Pareto simulated annealing mechanism ranks second, with RPC values increasing with problem size. In 30-10 and 30-15, the values reach 1.03% and 1.68%, respectively, demonstrating that this mechanism enhances exploration and improves stability in complex optimization problems. The heuristic population initialization mechanism also demonstrates adaptability, particularly in large-scale instances. In 20-15, its RPC value reaches 3.08%, exceeding non-PSA and ISR, suggesting that a well-structured initialization strategy enables the algorithm to enter high-quality solution spaces more efficiently, improving optimization performance. The infeasible solution repair mechanism (ISR) shows relatively minor improvements but remains effective in certain cases. Its RPC values are 2.79% in 10-10 and 2.18% in 10-5, indicating that repairing infeasible solutions during the search process helps reduce interference from invalid solutions and stabilizes the search direction.
The four improved mechanisms proposed in this study enhance the algorithm’s performance, with the crossover and mutation strategies delivering the best results.
In addition, to assess whether these four mechanisms improve the proposed algorithm’s ability to explore a broader objective space, the HV values of IICNSGA-III, non-HPI, ISR, SBX-PM, and non-PSA were calculated across the nine test cases, as shown in Figure 7. Across all problem scales, the HV value of IICNSGA-III remains consistently the highest, indicating that all four mechanisms effectively enhance the algorithm’s global optimization capability.