#### *5.4. Strategies for Hyperparameter Optimization in the Evolutionary Learning Algorithm*

As noted in Issue 4, the very large search space is a major problem in generative design. To demonstrate that it can be addressed by applying specialized hyperparameter tuning strategies, a set of experiments was conducted.

As can be seen from Figure 6, the direct tuning strategy means that each atomic model is considered an autonomous model during tuning. The computational cost of tuning is low in this case (since it is not necessary to fit all the models in a chain to estimate the quality metric), but the found set of parameters can be non-optimal. Composite model tuning makes it possible to take into account the influence of the chain beyond the scope of an individual atomic model, but at the cost of additional computations to tune all the models. The pseudocode of an algorithm for composite model tuning is presented in Algorithm 1.
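For illustration, the composite tuning loop of Algorithm 1 can be sketched as follows. The `chain_quality` function is a toy stand-in for fitting and scoring a real chain, and all names, ranges, and the quality surface are illustrative assumptions rather than the actual FEDOT API:

```python
import random

random.seed(0)

# Toy stand-in for fitting a two-node chain and measuring its quality:
# the score depends jointly on both hyperparameters, which is why tuning
# each node independently (direct strategy) can miss the joint optimum.
def chain_quality(alpha, beta):
    return -(alpha - 0.3) ** 2 - (beta - 0.7) ** 2 - 0.5 * alpha * beta

def tune_composite(n_iter=200):
    """Random search over the joint hyperparameter space of the whole chain."""
    best_params, best_q = None, float("-inf")
    for _ in range(n_iter):
        # sample a candidate set of hyperparameters for every node at once
        alpha, beta = random.uniform(0, 1), random.uniform(0, 1)
        q = chain_quality(alpha, beta)  # requires refitting the whole chain
        if q > best_q:
            best_params, best_q = (alpha, beta), q
    return best_params, best_q

params, quality = tune_composite()
```

The joint evaluation is exactly what makes composite tuning more expensive than direct tuning: every candidate requires fitting the full chain.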


The results of tuning the composite models for different regression problems obtained from the PMLB benchmark suite (available at https://github.com/EpistasisLab/pmlb) are presented in Table 1. The self-developed toolbox that was used to run the experiments with PMLB and FEDOT is available in an open repository (https://github.com/ITMO-NSS-team/AutoML-benchmark). The applied tuning algorithm is based on a random search in a pre-defined range.

**Table 1.** The quality measures for the composite models before and after random search-based tuning of hyperparameters. The regression problems from the PMLB suite [45] are used as benchmarks.


It can be seen that hyperparameter optimization allows increasing the quality of the models in most cases.

#### *5.5. Estimation of the Empirical Performance Models*

The experiments on performance model identification (the problem raised in Issue 5) were performed using a benchmark with a large number of features and observations in the sample. The benchmark is based on a classification task from the robotics field. It is quite a suitable example, since there are many tasks in this domain that can be performed on different computational resources, from embedded systems to supercomputers. The analyzed task is devoted to manipulator grasp stability prediction and was obtained from a Kaggle competition (https://www.kaggle.com/ugocupcic/grasping-dataset).

An experiment consists of grasping a ball, shaking it for a while, and computing the grasp robustness. Multiple measurements are taken during a given experiment, but only one robustness value is associated with it. The obtained dataset is balanced, with a 50/50 share of stable and unstable grasps.

The approximation of the EPM with simple regression models is a common way to analyze the performance of algorithms [46]. After the set of experiments, it was confirmed that for the majority of the considered models, the common regression surface of a single-model EPM can be represented as a linear model. However, some of the considered models are better described by another regression surface (see the quality measures for the different structures of EPM in Appendix A); one of them is the random forest model EPM. According to the structure of Equation (9), these structures of EPM can be represented as follows:

$$T^{EPM} = \begin{cases} \Theta_1 N_{obs} N_{feat} + \Theta_2 N_{obs}, & \text{for the common case,} \\ \dfrac{N_{obs}}{\Theta_1^2} + \dfrac{N_{obs}^2 N_{feat}}{\Theta_2^2}, & \text{specific case for random forest,} \end{cases} \tag{23}$$

where $T^{EPM}$ is the model fitting time estimate (represented in ms according to the scale of coefficients from Table 2), $N_{obs}$ is the number of observations in the sample, and $N_{feat}$ is the number of features in the sample. The characteristics of the computational resources and the hyperparameters of the model are considered static in this case.

We applied the least squares estimation (LSE) algorithm to (23) and obtained the Θ coefficients for the set of models presented in Table 2. The coefficient of determination *R*<sup>2</sup> is used to evaluate the quality of the obtained performance models.
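Since the common-case EPM of Equation (23) is linear in the Θ coefficients, the LSE fit reduces to ordinary least squares. The sketch below illustrates this on synthetic timing data; the coefficient values and noise level are illustrative assumptions, not the measurements behind Table 2:

```python
import numpy as np

# Synthetic fitting-time measurements following the common case of Eq. (23):
# T = Theta1 * N_obs * N_feat + Theta2 * N_obs, plus measurement noise.
rng = np.random.default_rng(42)
n_obs = rng.integers(100, 10_000, size=50).astype(float)
n_feat = rng.integers(5, 100, size=50).astype(float)
theta_true = (2e-4, 3e-3)  # illustrative "true" coefficients
t_fit = theta_true[0] * n_obs * n_feat + theta_true[1] * n_obs
t_fit += rng.normal(0, 0.5, size=50)

# Least-squares estimate of the Theta coefficients: the model is linear
# in Theta, so a design matrix with the two regressors suffices.
X = np.column_stack([n_obs * n_feat, n_obs])
theta, *_ = np.linalg.lstsq(X, t_fit, rcond=None)

# Coefficient of determination R^2 of the fitted EPM
pred = X @ theta
r2 = 1 - np.sum((t_fit - pred) ** 2) / np.sum((t_fit - t_fit.mean()) ** 2)
```

The random-forest case of (23) is nonlinear in Θ and would instead require a nonlinear least-squares routine.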


**Table 2.** The examples of coefficients for the different performance models.

The application of evolutionary optimization to the benchmark allows finding the optimal structure of the composite model for the specific problem. We demonstrate the construction of an EPM for a composite model that consists of logistic regression and random forest as primary nodes and logistic regression as a secondary node. On the basis of (11), the EPM for this composite model can be represented as follows:

$$T_{Add}^{EPM} = \max\left(\Theta_{1,1} N_{obs} N_{feat} + \Theta_{2,1} N_{obs},\; \Theta_{1,2} N_{obs} N_{feat} + \Theta_{2,2} N_{obs}\right) + \frac{N_{obs}}{\Theta_{1,3}^2} + \frac{N_{obs}^2 N_{feat}}{\Theta_{2,3}^2}, \tag{24}$$

where $T_{Add}^{EPM}$ is the composite model fitting time estimated by the additive EPM, and $\Theta_{i,j}$ is the *i*-th coefficient of the *j*-th model type for the EPM according to Table 2.
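The additive combination can be sketched directly from the structure of Equation (24): the maximum of the two linear local EPMs plus the random-forest-type term. The coefficient values below are placeholders, not the fitted coefficients of Table 2:

```python
def t_epm_add(n_obs, n_feat, th):
    """Additive composite EPM mirroring Eq. (24) term by term.
    th[i][j] stands in for the coefficient Theta_{i,j}; the values passed
    in are placeholders, not the fitted coefficients of Table 2."""
    linear_1 = th[1][1] * n_obs * n_feat + th[2][1] * n_obs
    linear_2 = th[1][2] * n_obs * n_feat + th[2][2] * n_obs
    rf_term = n_obs / th[1][3] ** 2 + n_obs ** 2 * n_feat / th[2][3] ** 2
    return max(linear_1, linear_2) + rf_term

# Placeholder coefficients: Theta_{i,j} = 1 for all i, j
theta = {1: {1: 1.0, 2: 1.0, 3: 1.0}, 2: {1: 1.0, 2: 1.0, 3: 1.0}}
estimate = t_epm_add(n_obs=10, n_feat=2, th=theta)
```

With all coefficients set to one, the estimate is max(30, 30) + (10 + 200) = 240 (in the time units of the fitted coefficients).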

The performance model for the composite model with three nodes (LR + RF = LR) is shown in Figure 17. The visualizations for the atomic models are available in Appendix A.

**Figure 17.** Predictions of the performance model that uses an additive approach for local empirical performance models (EPMs) of atomic models. The red points represent the real evaluations of the composite model as a part of validation.

The RMSE (root mean squared error) measure is used to evaluate the quality of the chain EPM estimates against real measurements. In this case, the obtained *RMSE* = 21.3 s confirms the good quality of the obtained estimation in the observed 0–400 s range.
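The validation metric itself can be reproduced as follows; the prediction and measurement arrays are illustrative placeholders, not the validation data of Figure 17:

```python
import numpy as np

# Chain EPM estimates vs. real fitting-time measurements, in seconds
# (illustrative numbers only, not the data behind Figure 17)
predicted = np.array([12.0, 55.0, 130.0, 260.0, 390.0])
measured = np.array([10.0, 60.0, 120.0, 270.0, 380.0])

# Root mean squared error of the EPM against the real measurements
rmse = float(np.sqrt(np.mean((predicted - measured) ** 2)))
```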

### **6. Discussion and Future Works**

In a wider sense, the co-design problem may be solved as an iterative procedure that includes additional tuning during the model execution stage and a cyclic closure (or rebuilding stage) with respect to time evolution. The rebuilding stage may be initiated by two types of events: (1) the model error exceeds an acceptable threshold $e_c$; (2) the execution time exceeds an acceptable threshold $\tau_c$. In this case, a solution is to build a new model with respect to the corrected set of structures $\tilde{S}$ and performance model $\tilde{T}_M$:

$$E^{\min}(M^*, t) > \rho_c, \quad T_{ex}^{\min} > \tau_c, \quad \tilde{E}^{\min}(M^{**}, t) = \max_{\tilde{M}} F\left(\tilde{M}, t \mid T_{\tilde{M}} \le \tau_c, \; T_{\tilde{S}} \le \tau_S\right), \tag{25}$$

where *t* is the real-time variable and *ρc* is a critical threshold for the values of the error function *E*. Such a problem is typical for models that are connected with the lifecycle of their prototype, e.g., models inside a digital shadow of an industrial system [47], weather forecasting models [48], etc.
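The two rebuilding triggers can be expressed as a simple monitoring rule over execution-stage records; the log format, function name, and threshold names below are illustrative assumptions, not part of the formalism:

```python
def first_rebuild_event(model_log, e_c, tau_c):
    """Scan per-step (error, execution time) records and return the first
    event that should trigger model rebuilding: (1) the model error exceeds
    the threshold e_c, or (2) the execution time exceeds the threshold
    tau_c. Returns None if no rebuilding is needed."""
    for step, (error, exec_time) in enumerate(model_log):
        if error > e_c:
            return step, "error"
        if exec_time > tau_c:
            return step, "time"
    return None

# Example: the error threshold is crossed at the third step
event = first_rebuild_event([(0.02, 1.0), (0.03, 1.2), (0.12, 1.1)],
                            e_c=0.1, tau_c=5.0)
```

Once an event fires, the new model is searched for under the corrected structure set and performance model, as in (25).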

Additional fitting of the co-designed system may also appear at the level of model execution, where the classic scheduling approach may be blended with model tuning. The classic formulation of scheduling for resource-intensive applications, $T_{ex}^{\min}(L^*) = \min_L G(L \mid M, I)$, is based on the idea of an optimization search for an algorithm $L^*$ that provides the minimal computation time $T_{ex}^{\min}$ of the model execution process through balanced schedules of the workload on computational nodes. However, such an approach is restricted by the assumption of uniform performance models for all parts of the application. In real cases, the performance of an application may change dynamically in time and among functional parts. Thus, to reach more effective execution, it is desirable to formulate the optimization problem with respect to the possibility of tuning the model characteristics that influence model performance:

$$T_{ex}^{\min}\left(\left\{a_{1:|S|}\right\}^*, L^*\right) = \min_{a, L} G\left(M\left(\left\{a_{1:|S|}\right\}\right), L \mid I\right), \quad M = \left\langle S^*, E^*, \left\{a_{1:|S|}\right\}\right\rangle, \quad L = \left\{L_i\right\}, \tag{26}$$

where *G* is an objective function that characterizes the expected time of model execution with respect to the used scheduling algorithm *L* and model *M*. In the context of the generative modeling problem, at the execution stage the model *M* can be fully described as a set of model properties that consists of the optimal model structure: the optimal functions *S*∗ (from the previous stage) and an additional set of performance-influential parameters *a*1:|*S*|. Similar approaches can be found in several publications, e.g., [49].
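A minimal sketch of the joint problem (26) is a search over model parameters and scheduling algorithms together. The timing model `g_time`, the parameter grid, and the scheduler names are all illustrative assumptions:

```python
import itertools

def g_time(a, scheduler):
    """Toy objective G(M(a), L | I): expected execution time of the model
    with performance-influential parameter a under scheduling algorithm L.
    The timing model is an illustrative assumption, not a real profile."""
    base = 100.0 / a  # larger a speeds up the model itself
    overhead = {"round_robin": 20.0, "greedy": 5.0 + 2.0 * a}[scheduler]
    return base + overhead

# Joint search over the parameter grid and the set of scheduling algorithms:
# neither choice is optimal in isolation, since the scheduling overhead
# here depends on the chosen model parameter.
candidates = itertools.product([1.0, 2.0, 4.0, 8.0], ["round_robin", "greedy"])
a_best, l_best = min(candidates, key=lambda c: g_time(*c))
```

This coupling between *a* and *L* is what distinguishes (26) from the classic scheduling formulation, where the model is fixed.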

## **7. Conclusions**

In this paper, we aimed to highlight different aspects of the creation of mathematical models using an automated evolutionary learning approach. Such an approach may be represented from the perspective of generative design and co-design of mathematical models. First of all, we formalized several open and unsolved issues that exist in the field of generative design of mathematical models. They concern different aspects: computational complexity, performance modeling, parallelization, interaction with the infrastructure, etc. A set of experiments was conducted as proof-of-concept solutions for every announced issue and obstacle. The composite ML models obtained by the FEDOT framework and the differential equation-based models obtained by the EPDE framework were used as case studies. Finally, the common concepts of the co-design implementation were discussed.

**Author Contributions:** Conceptualization, A.V.K. and A.B.; Investigation, N.O.N., A.H., M.M. and M.Y.; Methodology, A.V.K.; Project administration, A.B.; Software, N.O.N., A.H. and M.M.; Supervision, A.B.; Validation, M.M.; Visualization, M.Y.; Writing–original draft, A.V.K., N.O.N. and A.H. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research is financially supported by the Ministry of Science and Higher Education, Agreement #075-15-2020-808.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Abbreviations**

The following abbreviations are used in this manuscript:

