1. Introduction
Over the past decades, surrogate-assisted evolutionary algorithms (SAEAs), a prominent class of evolutionary computation (EC) algorithms, have demonstrated considerable success in solving complex optimization problems through the use of surrogate models [1,2,3,4]. Traditional evolutionary algorithms (EAs) rely on fitness evaluations (FEs) to identify good individuals for evolutionary search. However, when FEs are limited or expensive to obtain, the performance of EAs degrades significantly [5]. This limitation is common in many real-world scenarios, often known as expensive optimization problems (EOPs), where each FE incurs high computational or financial cost [6,7,8]. To address this, SAEAs leverage surrogate models trained on previously evaluated solutions to approximate FEs, enabling efficient search under limited evaluation budgets. In many practical settings, particularly those constrained by time, computational resources, or data access, conducting new FEs during optimization is impractical or impossible [9,10,11]. In such contexts, offline SAEAs are especially valuable, as they construct surrogates solely from existing data and rely entirely on surrogate-assisted evaluation throughout the optimization process. As a result, SAEAs provide a cost-effective and practical alternative to conventional EAs for EOPs.
Although SAEAs have shown progress in solving EOPs, the accuracy of the constructed surrogates has a great influence on the optimization results. Therefore, how to build accurate surrogates for solving EOPs is a crucial question in designing SAEAs. To date, research on improving SAEAs has primarily followed two major directions. The first focuses on enhancing the quality and quantity of available data, as these factors critically influence the accuracy and reliability of surrogate models. For example, the presence of noise in the data can significantly impair the surrogate’s predictive performance, making data preprocessing an essential step [5]. When handling complex data structures, advanced learning techniques can be employed to uncover latent patterns and improve model robustness. One of the most persistent challenges in SAEAs is the limited size of the evaluated dataset. In general, larger datasets lead to more accurate and generalizable surrogate models [12,13]. Accordingly, extensive research has been dedicated to increasing data availability and making better use of existing data. Representative approaches include local smoothing techniques and synthetic data generation strategies [14,15], which aim to augment the dataset and enrich the learning process.
The second major research direction aims to construct more accurate surrogate models and/or develop effective surrogate model management (SMM) strategies to govern their use. Several surrogate modeling techniques have been explored, including Kriging [16], random forests [17,18], and radial basis function neural networks (RBFNNs) [19]. Moreover, ensemble learning has become a powerful tool to improve surrogate prediction by aggregating multiple base models [20,21]. In parallel, SMM plays a crucial role in surrogate-assisted evolutionary optimization (SAEO), particularly in scenarios with limited data. SMMs are designed to dynamically manage model usage across the optimization process, including strategies for sample selection [22,23] and knowledge transfer between related tasks [24,25,26]. Moreover, different learning paradigms have also been studied, such as contrastive learning [27], federated learning [28], symbolic regression [29], dimension reduction [30], and autoencoder embedding [31]. Collectively, these strategies aim to maximize data utility, mitigate overfitting, and enhance the optimization performance of SAEAs under FE-constrained conditions.
Despite numerous advancements, balancing prediction accuracy and generalization ability remains a critical challenge in the design of SAEAs, which restricts their applicability in solving EOPs. To address this, a novel probability selection-based SAEA (PS-SAEA) is proposed in this paper to more effectively tackle EOPs. Specifically, a probabilistic model selection (PMS) strategy is introduced to select promising surrogate models in a stochastic manner, thereby avoiding the overfitting commonly caused by greedy selection mechanisms. In addition, a weighted model ensemble (WME) method is developed to integrate the selected models, with weighting determined by each model’s prediction error, to produce more accurate fitness estimations.
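To make the two components more concrete, the following minimal Python sketch illustrates the idea behind PMS and WME, assuming a pool of trained surrogates with known validation errors. The inverse-error selection probabilities and weighting rule are illustrative assumptions, not the paper's exact formulas.

```python
import numpy as np

def probabilistic_model_selection(errors, n_select, rng=None):
    """Select surrogate models stochastically: lower validation error gives a
    higher selection probability (roulette-wheel style). The inverse-error
    probability used here is an illustrative assumption."""
    rng = np.random.default_rng() if rng is None else rng
    scores = 1.0 / (np.asarray(errors) + 1e-12)   # small error -> large score
    probs = scores / scores.sum()
    return rng.choice(len(errors), size=n_select, replace=False, p=probs)

def weighted_ensemble_predict(models, errors, X):
    """Combine the selected models, weighting each one inversely to its
    prediction error so that more accurate surrogates dominate the estimate."""
    weights = 1.0 / (np.asarray(errors) + 1e-12)
    weights /= weights.sum()
    preds = np.stack([m.predict(X) for m in models])   # shape: (n_models, n_points)
    return weights @ preds
```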
The main contributions of this paper can be summarized as follows:
- (1) A PMS strategy is proposed to select models for ensemble prediction by considering both prediction accuracy and selection probability, thereby achieving a better trade-off between surrogate accuracy and generalization ability.
- (2) A WME mechanism is introduced to combine the selected models into an ensemble, with each model weighted according to its prediction error, so as to enhance the overall reliability and accuracy of fitness approximation.
- (3) By integrating PMS and WME, a new algorithm, PS-SAEA, is developed for solving EOPs more efficiently, which offers a promising approach for complex optimization tasks.
Comprehensive experiments are conducted on well-established benchmark problems with varying dimensionalities. The results show that the PS-SAEA significantly outperforms cutting-edge SAEAs across different scenarios. This confirms the effectiveness and robustness of PS-SAEA in handling EOPs.
The remainder of this paper is organized as follows: Section 2 provides a brief overview of SAEAs and reviews related work. Section 3 details the proposed PS-SAEA, including its core components and algorithmic framework. Section 4 presents the experimental setup, benchmark problems, and evaluation. Finally, Section 5 concludes the paper.
4. Experimental Studies
4.1. Experiment Setup
In this study, five widely recognized problems are employed to comprehensively assess the performance of the proposed PS-SAEA. The selected test functions, as shown in Table 1 and denoted as T1 through T5, are frequently used in the SAEA literature due to their diverse landscape characteristics and varying degrees of complexity. Specifically, T1 (Ellipsoid) is a unimodal function designed to assess an algorithm’s ability to converge to the global optimum efficiently. In contrast, T2 (Rosenbrock), T3 (Ackley), T4 (Griewank), and T5 (Rastrigin) are multimodal functions that present significant challenges for SAEAs due to their numerous local optima. Together, these functions span a broad range of search landscapes, enabling a robust evaluation of the PS-SAEA’s exploration and exploitation capabilities.
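For reference, the standard formulations of these five benchmarks are sketched below in Python; the exact variable ranges and any shifts used in Table 1 are not reproduced here, so this should be read as an illustrative definition rather than the precise experimental configuration.

```python
import numpy as np

def ellipsoid(x):   # T1: unimodal, weighted sphere
    i = np.arange(1, x.size + 1)
    return np.sum(i * x**2)

def rosenbrock(x):  # T2: narrow curved valley, optimum value 0 at x = 1
    return np.sum(100.0 * (x[1:] - x[:-1]**2)**2 + (x[:-1] - 1.0)**2)

def ackley(x):      # T3: many shallow local optima
    d = x.size
    return (-20.0 * np.exp(-0.2 * np.sqrt(np.sum(x**2) / d))
            - np.exp(np.sum(np.cos(2 * np.pi * x)) / d) + 20.0 + np.e)

def griewank(x):    # T4: product term couples the variables
    i = np.arange(1, x.size + 1)
    return 1.0 + np.sum(x**2) / 4000.0 - np.prod(np.cos(x / np.sqrt(i)))

def rastrigin(x):   # T5: highly multimodal
    return 10.0 * x.size + np.sum(x**2 - 10.0 * np.cos(2 * np.pi * x))
```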
In line with standard practices, each test function is evaluated under four different dimensional settings: D = {10, 30, 50, 100}, where D denotes the problem dimension. All test functions are configured such that the global optimum is known and fixed at zero, facilitating a fair and consistent performance comparison.
To ensure equitable and reproducible comparisons, the experimental setup is as follows. First, Latin hypercube sampling [36] is employed to generate 11 × D data points across the entire search space, thereby forming the initial real-evaluation dataset for each SAEA. Each data point is a numerical vector that represents the corresponding candidate solution. Based on this dataset, each SAEA constructs surrogate models to guide the evolutionary search toward optimal solutions. Importantly, under this offline data-driven setting, no algorithm is allowed to perform real fitness evaluations beyond the initial 11 × D samples. This constraint reflects realistic conditions where only limited evaluation resources are available. All experiments are conducted using MATLAB (R2023a). The experimental environment is a computer server with two Intel(R) Xeon(R) W5-3423 CPUs (Dell, Beijing, China) and 256 GB of RAM.
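As an illustration of this offline setup (the authors' implementation is in MATLAB), the following Python sketch builds the 11 × D offline dataset with Latin hypercube sampling; the objective function and box bounds are placeholders.

```python
import numpy as np
from scipy.stats import qmc

def build_offline_dataset(obj_func, dim, lower, upper, seed=0):
    """Draw 11*D Latin hypercube samples in [lower, upper]^D and evaluate them
    once; no further real evaluations are performed afterwards."""
    sampler = qmc.LatinHypercube(d=dim, seed=seed)
    unit = sampler.random(n=11 * dim)                    # samples in [0, 1]^D
    X = qmc.scale(unit, [lower] * dim, [upper] * dim)    # map to the search space
    y = np.array([obj_func(x) for x in X])               # the only real FEs
    return X, y
```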
To mitigate statistical variability, each algorithm is run independently 25 times on each problem instance, and the average results are reported. For significance testing, the Wilcoxon rank-sum test is conducted with a significance level of α = 0.05. For clarity in result interpretation, three symbols are used to represent comparative outcomes: “+” indicates that PS-SAEA performs significantly better than the compared algorithm; “≈” denotes statistically equivalent performance; and “−” indicates significantly inferior performance of PS-SAEA. These symbols provide an intuitive summary of the experimental results, as presented in the following sections.
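A minimal sketch of this significance test is given below, assuming the 25 per-run final errors of two algorithms are available as arrays; deciding the direction of the difference by comparing means is an illustrative choice rather than the authors' exact procedure.

```python
import numpy as np
from scipy.stats import ranksums

def compare_runs(ps_saea_errors, baseline_errors, alpha=0.05):
    """Two-sided Wilcoxon rank-sum test over the 25 independent runs.
    Returns '+' if PS-SAEA is significantly better (lower error),
    '-' if significantly worse, and '≈' if the difference is not significant."""
    _, p_value = ranksums(ps_saea_errors, baseline_errors)
    if p_value >= alpha:
        return "≈"
    return "+" if np.mean(ps_saea_errors) < np.mean(baseline_errors) else "-"
```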
4.2. Compared Advanced Algorithms
To thoroughly evaluate the effectiveness of the proposed PS-SAEA, four state-of-the-art SAEAs are selected for comparison: SAEA-SE [21], BSAEA [15], SAEA-PES [14], and CL-SAEA [27]. These algorithms are all representative SAEAs that employ multiple surrogate models for ensemble prediction and have obtained promising results, each adopting a distinct strategy for surrogate selection. As such, they provide an ideal baseline for assessing the performance of PS-SAEA, which introduces the novel PMS and WME for ensemble model selection.
To ensure fairness and consistency in the comparative study, all competing algorithms are implemented using their official or publicly available codebases, thereby eliminating discrepancies that may arise from implementation differences. Furthermore, the proposed PS-SAEA adopts the same evolutionary operators as those utilized in the compared algorithms. That is, both the PS-SAEA and the compared SAEAs use the simulated binary crossover operator [37] and the polynomial mutation operator [38]. The parameter settings are kept consistent with those used in the compared SAEAs. Specifically, the crossover probability is set to 100%, while the mutation probability is defined as 1/D, where D denotes the problem dimensionality. The evolutionary process terminates after 500 generations, which serves as the termination criterion. This design choice ensures that any observed performance improvements can be explicitly attributed to the proposed PMS and WME, rather than to differences in the underlying evolutionary mechanisms. In addition, the number of available models in these algorithms is set to 2000.
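For readers unfamiliar with these operators, a compact Python sketch of simulated binary crossover and polynomial mutation is given below. The distribution indices eta_c and eta_m are common default assumptions (their exact values are not stated here), and the mutation uses the per-gene probability 1/D from the setup above.

```python
import numpy as np

def sbx_crossover(p1, p2, eta_c=15, rng=None):
    """Simulated binary crossover; eta_c is an assumed distribution index."""
    rng = np.random.default_rng() if rng is None else rng
    u = rng.random(p1.size)
    beta = np.where(u <= 0.5,
                    (2.0 * u) ** (1.0 / (eta_c + 1)),
                    (1.0 / (2.0 * (1.0 - u))) ** (1.0 / (eta_c + 1)))
    c1 = 0.5 * ((1 + beta) * p1 + (1 - beta) * p2)
    c2 = 0.5 * ((1 - beta) * p1 + (1 + beta) * p2)
    return c1, c2

def polynomial_mutation(x, lower, upper, eta_m=20, rng=None):
    """Polynomial mutation with per-gene probability 1/D, clipped to the bounds."""
    rng = np.random.default_rng() if rng is None else rng
    y = x.copy()
    mask = rng.random(x.size) < 1.0 / x.size
    u = rng.random(x.size)
    delta = np.where(u < 0.5,
                     (2.0 * u) ** (1.0 / (eta_m + 1)) - 1.0,
                     1.0 - (2.0 * (1.0 - u)) ** (1.0 / (eta_m + 1)))
    y[mask] = np.clip(y[mask] + delta[mask] * (upper - lower), lower, upper)
    return y
```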
4.3. Comparison Study with SAEAs
The results presented in Table 2 compare the proposed PS-SAEA with four representative SAEA variants: SAEA-SE, BSAEA, SAEA-PES, and CL-SAEA, across four scenarios (D = 10, 30, 50, 100) and five test problems (T1–T5) for each. To provide a better illustration, Figure 2 plots the number of better, similar, and worse results obtained by the PS-SAEA when compared with different SAEAs.
In the low-dimensional setting (D = 10), PS-SAEA demonstrates competitive performance. It outperforms SAEA-SE on T2 and performs equivalently on the remaining tasks. Compared to BSAEA and SAEA-PES, PS-SAEA achieves better performance on two tasks and comparable results on the rest, indicating that while the performance gap is not large, PS-SAEA maintains robustness and consistency. Compared to CL-SAEA, the proposed PS-SAEA obtains significantly better results on two problems and similar results on one problem. Notably, all compared methods exhibit similar levels of variance, which suggests that the differences lie primarily in convergence behavior rather than in instability.
When D = 30 (medium-dimensional cases), the differences become more pronounced. PS-SAEA outperforms BSAEA and SAEA-PES in all five problems and achieves two wins against SAEA-SE and CL-SAEA. The advantage is particularly evident in T1 and T5, where the gaps in mean performance are substantial. This highlights PS-SAEA’s strong search ability. The consistent superiority over BSAEA and SAEA-PES demonstrates better adaptability to landscape changes and more efficient surrogate model usage.
When D = 50, as problem complexity increases, the performance gap continues to widen. PS-SAEA significantly outperforms the compared SAEAs in almost all test problems. Especially for T1, T3, T4, and T5, PS-SAEA shows much lower mean errors and tighter standard deviations. In this setting, BSAEA and SAEA-PES often suffer from higher variance and degraded accuracy, while PS-SAEA maintains stable convergence. These results validate the effectiveness of the PMS and WME used in PS-SAEA, which help balance accuracy and generalization ability even in higher dimensions.
When D = 100 (high-dimensional cases), PS-SAEA exhibits superior overall performance, with five significantly better results over SAEA-SE and SAEA-PES, three significantly better and two similar performances against CL-SAEA, and two significantly better and three similar performances against BSAEA. The most notable gap occurs in T1 and T2, where SAEA-SE and SAEA-PES produce very large mean errors and variances, likely due to model misguidance or premature convergence, especially in higher-dimensional problems with multiple local optima. In contrast, PS-SAEA maintains the prediction-driven search focus and avoids local optima, indicating the excellent scalability and robustness of the method.
To better evaluate the effectiveness of the surrogate, Table 3 gives the comparisons between the PS-SAEA and other SAEAs in terms of mean absolute error (MAE). Overall, PS-SAEA achieves the lowest or near-lowest MAE in most cases, demonstrating its superior surrogate accuracy and generalization ability. In the low-dimensional scenario (D = 10), PS-SAEA exhibits the best prediction accuracy in T2 and T3, and remains highly competitive in other tasks, implying that its model ensemble can effectively capture local landscape features. When the dimension increases to D = 30, PS-SAEA consistently yields lower MAE values than SAEA-SE, BSAEA, and SAEA-PES, particularly in T1 and T5, showing that the proposed prediction-guided mechanism enhances model fidelity even with limited evaluations. Notably, CL-SAEA occasionally achieves low MAEs (e.g., T5), but its performance is unstable across tasks, indicating overfitting or excessive reliance on specific samples. At D = 50, PS-SAEA continues to deliver stable and accurate surrogate predictions, maintaining a clear advantage over other SAEAs in most test problems. The reduced errors on T1–T3 demonstrate that the PMS and WME modules contribute to the construction of diverse yet reliable models. Finally, in the high-dimensional setting (D = 100), PS-SAEA markedly outperforms all baselines on T1–T3 and T5, achieving up to 80% lower MAE than SAEA-SE and SAEA-PES. These results confirm that PS-SAEA not only maintains search robustness but also significantly improves model accuracy, which directly contributes to its superior optimization performance in high-dimensional spaces.
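For reference, the MAE reported in Table 3 follows the standard definition (the notation below, with N test points, is introduced here only for clarity):

$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|\hat{f}(\mathbf{x}_i) - f(\mathbf{x}_i)\right|,$$

where $\hat{f}(\mathbf{x}_i)$ denotes the surrogate (ensemble) prediction and $f(\mathbf{x}_i)$ the true objective value of the i-th test point.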
From the above, PS-SAEA achieves no losses across all test scenarios, consistently delivering equal or superior performance, and especially outperforms the other SAEAs in mid-to-high dimensional problems, validating the effectiveness of the PMS and WME. These results demonstrate that PS-SAEA not only improves optimization efficiency but also enhances generalization across diverse problem scales, making it a robust and promising approach for EOPs.
4.4. Component Analysis of PS-SAEA
This part analyzes the PMS and WME of PS-SAEA. To do this, the PS-SAEA is compared with its variants that do not use PMS or WME, denoted as PS-SAEA-P and PS-SAEA-W, respectively. In PS-SAEA-P, the models are selected greedily based on their prediction error on the evaluated data. In PS-SAEA-W, the WME is removed, and the selected models are combined by simple averaging.
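As a hedged sketch of how these two ablation variants differ from the full PS-SAEA (complementing the PMS/WME sketch in Section 1), the greedy selection and plain-average ensemble could look as follows; the details are illustrative assumptions.

```python
import numpy as np

def greedy_model_selection(errors, n_select):
    """PS-SAEA-P variant: deterministically keep the n_select models with the
    lowest validation error, with no stochasticity in the selection."""
    return np.argsort(errors)[:n_select]

def average_ensemble_predict(models, X):
    """PS-SAEA-W variant: combine the selected models by simple averaging
    instead of error-based weighting."""
    return np.mean(np.stack([m.predict(X) for m in models]), axis=0)
```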
Table 4 presents the performance comparisons among the original PS-SAEA and its two variants, PS-SAEA-P and PS-SAEA-W. To provide a better visualization, Figure 3 plots the average optimization results of different PS-SAEA variants. From the results, several important observations can be made regarding the contributions of the PMS and WME components in the PS-SAEA.
Across all problem dimensions, the original PS-SAEA consistently performs better than PS-SAEA-P. Specifically, PS-SAEA outperforms PS-SAEA-P in 15 out of 20 test cases, and shows similar performance in the remaining 5 cases. These results highlight the effectiveness of the PMS method, which introduces stochasticity into the model selection process. This helps avoid overfitting caused by deterministic greedy selection strategies, especially when training data are limited or noisy.
When compared with PS-SAEA-W, which removes the WME and instead averages the selected models for prediction, the original PS-SAEA still achieves better or comparable performance in all cases. Notably, PS-SAEA outperforms PS-SAEA-W in 2 cases, performs similarly in 18 cases, and never performs significantly worse. This suggests that while the WME component offers modest but consistent gains, it contributes to the robustness and overall predictive accuracy of the ensemble, particularly under complex or high-dimensional scenarios. This may be because the WME assigns higher weights to the more accurate surrogate models and lower weights to the less accurate ones, thereby yielding a final prediction with reduced error.
In summary, the results demonstrate that both the PMS and WME contribute positively to the effectiveness of PS-SAEA. Their integration enables the PS-SAEA to strike a better balance between accuracy and generalization, thereby enhancing its ability to solve EOPs more reliably.
4.5. Parameter Study of PS-SAEA
To study the impact and sensitivity of the parameter r in PS-SAEA, this paper conducts experimental comparisons among PS-SAEA variants using different values of r. Specifically, the original PS-SAEA using r = 7.5% is compared with variants using r = 2.5%, r = 5%, r = 10%, and r = 15%. For simplicity, these variants are referred to as PS-SAEA (r = 7.5%), PS-SAEA (r = 2.5%), PS-SAEA (r = 5%), PS-SAEA (r = 10%), and PS-SAEA (r = 15%), respectively.
The results presented in Table 5 analyze the sensitivity of the PS-SAEA with respect to different values of the selection ratio r, which determines the proportion of elite models retained for the ensemble. To provide a better visualization, Figure 4 plots the average optimization results of PS-SAEA variants with different r.
For D = 10, PS-SAEA (r = 7.5%) generally shows competitive or superior performance, achieving a significantly better result on T2 than the r = 2.5% variant and performing comparably in the remaining four cases. Interestingly, the r = 5% and r = 10% variants perform comparably in most cases and even achieve better results on T5, although each of them also shows one statistically inferior case compared with the default. This suggests that PS-SAEA is relatively robust in this low-dimensional environment, though r = 5% to 10% may offer slight performance advantages on some problems.
For D = 30, the comparisons show more divergence. PS-SAEA (r = 7.5%) performs slightly worse than its variants in a few cases: r = 5% and r = 10% outperform it on T2 and T3, while r = 2.5% and r = 15% lead to significantly worse outcomes on T1 and T4, respectively. Nevertheless, PS-SAEA (r = 7.5%) remains consistently within the top-performing group across all scenarios. These results imply that moderate values of r (5–10%) are better suited for balancing exploration and surrogate accuracy in different scenarios.
For D = 50, the differences among variants are subtler. All PS-SAEAs achieve statistically equivalent results in most cases, with only a single statistically worse result for r = 15% (T4). This indicates that the algorithm becomes less sensitive to the value of r as the problem dimension increases. Nevertheless, PS-SAEA (r = 7.5%) remains stable and effective across all test cases.
For D = 100, with high-dimensional characteristics, the results become more varied again. The default r = 7.5% achieves the best or equivalent results in all tasks, and in particular it significantly outperforms r = 5% and r = 10% on T2 and T5, indicating possible overfitting or model bias when the selection ratio is not well matched to such complex landscapes. Conversely, r = 2.5% and r = 15% remain statistically similar to the baseline but do not exhibit clear advantages.
The parameter study reveals that r = 7.5% consistently achieves stable and superior or comparable performance across a diverse set of benchmark problems and different dimensional settings. Specifically, in low to high-dimensional scenarios (e.g., D = 10, 30, 50, 100), this ratio balances the trade-off between exploration and exploitation effectively. The results indicate that smaller ratios (e.g., 2.5%) may lead to insufficient model diversity, hampering the surrogate ensemble’s robustness, while larger ratios (e.g., 10–15%) tend to cause overfitting or excessive focus on a limited subset of models, reducing generalization.
As analyzed above, PS-SAEA is not highly sensitive to the value of r, although excessively large or small values may degrade the performance of both the ensemble model and the PS-SAEA. The setting r = 7.5% achieves a good balance between accuracy and generalization ability and is therefore recommended in this paper.
4.6. Computational Cost of PS-SAEA
To investigate the computational cost of the PS-SAEA, the average running time (in seconds) is recorded and compared with three state-of-the-art SAEA variants: SAEA-SE, BSAEA, and SAEA-PES. The experiments are conducted on two representative test functions, T1 (Ellipsoid, unimodal) and T5 (Rastrigin, multimodal), with four different dimensions: 10, 30, 50, and 100. These test instances are selected to evaluate the time efficiency of PS-SAEA in handling problems with varying landscapes and dimensionalities. The detailed comparisons are provided in Table 6.
From the results, it can be observed that the computational cost of PS-SAEA is generally comparable to or slightly higher than that of other SAEAs in low-dimensional cases (e.g., D = 10 or 30). For instance, on T1 with D = 10, PS-SAEA takes 0.172 s on average per generation, which is close to SAEA-SE (0.187 s), BSAEA (0.169 s), and SAEA-PES (0.177 s). Similar trends can be found for T5 at low dimensions, indicating that the additional procedures introduced by PS-SAEA (i.e., probabilistic model selection and weighted ensemble) incur only a modest overhead in simpler scenarios.
However, as the problem dimension increases, the time cost of PS-SAEA rises more significantly compared to BSAEA. For example, on T1 with D = 100, PS-SAEA requires 11.654 s, while BSAEA takes only 4.826 s. This is primarily due to the extra computational burden associated with maintaining and selecting among multiple surrogate models, as well as performing weighted model ensemble operations. Nevertheless, when compared with SAEA-SE and SAEA-PES, both of which also incorporate complex model management strategies, the time cost of PS-SAEA remains on par or even lower in certain cases (e.g., T5 at D = 50).
Despite the increased time cost at higher dimensions, it is important to emphasize that PS-SAEA achieves a significantly better optimization performance, as demonstrated in previous sections. Thus, the trade-off between computational cost and solution quality is justifiable, particularly for EOPs where function evaluations are dominant over surrogate model operations.
In summary, although PS-SAEA introduces slight additional overhead due to model selection and ensemble procedures, its time complexity remains acceptable and comparable to existing advanced SAEAs, especially considering its substantial performance improvements. Therefore, PS-SAEA can be considered an effective and practical choice for solving both unimodal and multimodal expensive optimization problems across a wide range of dimensionalities.
4.7. Scalability with Parallel Processing
This part studies the scalability of PS-SAEA with parallel processing. As the main time costs come from building the models and making model predictions, we parallelized the model generation and prediction processes to test the time costs of PS-SAEA. To analyze its behavior with different numbers of threads (denoted as TR), the time costs of using different TRs are reported. The results on T1 and T5 are provided in Table 7.
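As a rough illustration of this parallelization scheme (the original experiments are implemented in MATLAB), the following Python sketch trains the surrogate pool and queries it in parallel; `train_one_model` and `datasets` are hypothetical placeholders for the RBFNN trainer and its training subsets.

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_build_and_predict(train_one_model, datasets, X_query, n_threads=12):
    """Build the surrogate pool and query it in parallel, mirroring the
    parallelized model generation and prediction steps described above."""
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        models = list(pool.map(train_one_model, datasets))                 # parallel model generation
        preds = list(pool.map(lambda m: m.predict(X_query), models))       # parallel prediction
    return models, preds
```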
From Table 7, it is evident that PS-SAEA benefits significantly from parallelization. When TR increases from 1 to 12, the time cost decreases almost linearly across both test cases (T1 and T5). For example, on T1 with D = 100, the execution time drops from 1398.481 s with a single thread to 125.131 s with 12 threads, yielding an approximate 11.2× speedup. A similar trend is observed on T5, where the time reduces from 1437.782 s to 128.445 s under the same configuration, corresponding to an 11.2× acceleration. These results confirm that the proposed PS-SAEA framework scales effectively with the number of computing threads.
Another important observation is that the scalability is more pronounced for higher-dimensional problems. While parallelization reduces computation time consistently across all settings, the absolute savings are far greater for large-scale dimensions. For instance, in T1 with D = 10, parallelization to 12 threads saves about 18.8 s, whereas for D = 100, it reduces the runtime by over 1270 s. This highlights the suitability of PS-SAEA for large-scale optimization tasks, where model construction and prediction become the dominant computational bottlenecks.
In summary, the results verify that parallelizing surrogate training and prediction enables PS-SAEA to maintain both efficiency and scalability. The near-linear speedup with increasing TR demonstrates that the framework can effectively exploit modern multi-core computing environments, thus making it practical for real-world high-dimensional optimization problems.
4.8. Comparisons with Basic EAs Without Surrogate-Assisted Method
To further validate the effectiveness of the proposed PS-SAEA, we compared it with Basic EAs (BEAs) that do not incorporate surrogates. Both PS-SAEA and BEA employ identical evolutionary operators; the only difference lies in the evaluation process: PS-SAEA replaces expensive real FEs with surrogate predictions, whereas BEA evaluates individuals through real FEs. To provide a fair comparison, BEAs with three different amounts of evaluated data were tested: 11 × D, 55 × D, and 110 × D FEs, where D is the problem dimension.
The results in Table 8 show that when the BEA has the same number of real FEs as the PS-SAEA, the PS-SAEA consistently and significantly outperforms the BEA across all dimensions, achieving 5/0/0 significantly better/similar/significantly worse results in all cases. Moreover, the results plotted in Figure 5 also show that the PS-SAEA extensively outperforms the BEA across all dimensions. This demonstrates the substantial advantage of surrogate-assisted evaluation under severely limited evaluation budgets.
When BEA is granted more real evaluations (55 × D and 110 × D), its performance improves, occasionally surpassing PS-SAEA on certain tasks (e.g., T2–T5 in 10D, T5 in higher dimensions). However, PS-SAEA still matches or exceeds BEA in most scenarios, despite using no additional real FEs. That is, PS-SAEA can use only about 10% of the evaluation budget to obtain similar or even better performance than the BEA. This suggests that the proposed surrogate-assisted approach can achieve competitive or superior results with significantly fewer expensive evaluations, thereby offering substantial cost savings.
Overall, these results confirm that PS-SAEA is not only highly effective under limited evaluation conditions but also remains competitive even when BEAs are allowed up to ten times more real evaluations.
4.9. Discussions
Results and Evidence: The experimental results demonstrate that the proposed PS-SAEA consistently outperforms or matches the performance of baseline SAEAs across multiple benchmark problems, especially under limited evaluation budgets. For instance, Table 2 shows the strong performance of PS-SAEA compared to existing SAEAs. Table 8 shows that PS-SAEA achieves significantly better results than BEAs with similar evaluation resources, and maintains competitive performance even when BEAs are allotted several times more evaluations. This empirical evidence supports the effectiveness of the surrogate-assisted approach and underscores its potential for cost-efficient optimization. Furthermore, the component analysis (Section 4.4) indicates that the PMS and WME components notably influence performance, with statistically significant differences confirmed through the applied tests. These findings provide concrete support for the claim that the integrated surrogate management strategies enhance robustness and accuracy.
Theorizing: From a theoretical perspective, the results corroborate the notion that managing model overfitting is crucial in surrogate-assisted optimization, especially with limited data. The success of the PS-SAEA’s PMS and ensemble strategies can be interpreted as mechanisms that balance exploration and exploitation, thereby improving generalization. The stochastic nature of the PMS method prevents the algorithm from prematurely converging due to over-reliance on a single surrogate model, aligning with ensemble learning theories that advocate for diversity and robustness.
Generalization ability: Although the paper primarily utilizes RBFNNs due to their efficiency and suitability for high-dimensional problems, the core architecture, particularly the model management strategy involving PMS and WME, can be extended to other surrogate types. For example, Gaussian processes (GPs) are highly popular due to their probabilistic nature, which provides both mean predictions and uncertainty estimates, making them valuable for balancing exploration and exploitation. Integrating GPs into the PS-SAEA framework would involve replacing the RBFNN predictor with a GP model. The PMS scheme could leverage the uncertainty estimates from GPs to assess model accuracy and diversity, potentially enhancing the ensemble’s reliability. Furthermore, since GPs inherently quantify prediction variance, the framework could incorporate uncertainty-based weighting strategies within the WME. Moreover, random forests (RFs) are robust, scalable ensemble models capable of handling noisy and high-dimensional data. Their ensemble structure aligns well with the WME approach. Adapting PS-SAEA to RFs would involve managing multiple RF models or variants within the ensemble, using predictive accuracy and diversity metrics to guide probabilistic selection. In addition, other surrogates (e.g., support vector regression, polynomial regression) can also be integrated with modifications to the model management protocols. For instance, support vector regression models could be incorporated with appropriate diversity metrics, whereas polynomial models can serve as simpler, faster approximants for certain problem landscapes.
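As one hedged illustration of the GP extension discussed above, the sketch below weights each GP in the ensemble inversely to its average predictive variance on the query points; this particular weighting rule is an assumption made here for illustration and is not part of the original PS-SAEA.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def gp_uncertainty_weighted_predict(gps, X):
    """Uncertainty-based WME variant: each fitted GP contributes with a weight
    inversely proportional to its average predictive variance on X."""
    means, variances = [], []
    for gp in gps:
        mu, std = gp.predict(X, return_std=True)
        means.append(mu)
        variances.append(np.mean(std**2))
    weights = 1.0 / (np.asarray(variances) + 1e-12)
    weights /= weights.sum()
    return weights @ np.stack(means)

# Example usage (subsets is a hypothetical list of training subsets):
# gps = [GaussianProcessRegressor(kernel=RBF()).fit(X_sub, y_sub) for X_sub, y_sub in subsets]
# y_hat = gp_uncertainty_weighted_predict(gps, X_query)
```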
Possible extension: While the current study focuses on continuous problems, adaptations can be made to accommodate the discrete or hybrid nature of many real-world applications [7]. For surrogate model building, surrogates such as classification models (e.g., decision trees, random forests) or specialized kernel methods can be employed to model combinatorial solutions. For mixed-variable domains, hybrid surrogates combining continuous and discrete modeling techniques can be designed, enabling the surrogate to effectively approximate objective functions with mixed types. For the evolutionary operator, the evolutionary components can be tailored to incorporate discrete mutation, crossover, and neighborhood search strategies suitable for combinatorial variables. Incorporating problem-specific operators can enhance search efficiency within the variable space.
4.10. Limitations
Although PS-SAEA has shown promising results in the preceding experimental studies, two limitations should be noted. First, the performance of PS-SAEA heavily depends on the quality of the evaluated data used for model training. As a result, noisy or incomplete data may significantly degrade the algorithm’s effectiveness. To enhance robustness in practice, one should incorporate data preprocessing techniques such as noise filtering, outlier detection, normalization, or data augmentation to improve data quality prior to modeling. Additionally, employing robust surrogate modeling approaches, such as ensemble models designed for noisy data, models that integrate uncertainty quantification, or regularization techniques, could further mitigate the impact of data imperfections. Adaptive strategies that dynamically detect and correct for noisy data, as well as active learning frameworks that selectively query the most informative samples, may also bolster the resilience of PS-SAEA in real-world applications. Second, the surrogate models and evolutionary operators used in PS-SAEA are specifically designed for continuous optimization problems. To extend its applicability to combinatorial EOPs, more suitable surrogate models and discrete evolutionary operators should be considered.
5. Conclusions
This paper aims to address the overfitting issue of surrogate models trained on limited evaluated data, a fundamental limitation in SAEAs. To enhance the generalization capability of surrogate-assisted optimization, this paper proposes a novel PS-SAEA that integrates PMS and WME mechanisms into the surrogate model management process. By employing PMS to probabilistically select models based on both accuracy and diversity, and WME to aggregate the selected models into an ensemble, the proposed PS-SAEA effectively mitigates overfitting and enhances the reliability of fitness approximation.
Extensive experimental evaluations on a suite of EOPs demonstrate that PS-SAEA consistently outperforms or performs comparably to several state-of-the-art SAEAs. Moreover, the component analysis validates the essential roles of the PMS and WME strategies in enhancing surrogate generalization and guiding the evolutionary search toward more effective solutions.
In summary, PS-SAEA offers a simple yet effective improvement to surrogate-based optimization frameworks, providing a promising foundation for further advancements in SAEA research. The PS-SAEA offers promising potential for deployment in various practical fields such as engineering design and energy system optimization. In engineering design, e.g., car or airplane design, it can efficiently navigate complex, high-dimensional design spaces where function evaluations, such as physical testing or detailed simulations, are expensive and time-consuming. Similarly, in energy optimization, such as optimizing power plant operations and renewable energy scheduling, the algorithm can deliver high-quality solutions while significantly reducing the number of expensive evaluations. However, for practical use in real-world applications, it is recommended to preprocess data carefully, tune model parameters based on domain knowledge, and leverage parallel processing when needed. These strategies can help reliably achieve high-quality solutions with minimal computational cost.
Future work may explore different evaluation metrics (e.g., mean absolute error), adaptive ensemble updating, multitask learning, dimensionality reduction or sparse modeling techniques, multi-objective optimization, dynamic optimization, and the incorporation of uncertainty quantification to further enhance optimization robustness in scenarios with limited or noisy data. Moreover, further application research on real-world problems across different areas is also needed.