*2.2. Dataset*

The datasets consist of the nodal data from our Abaqus FEM simulations. For each use case, the nodal data are split into a training dataset and a test dataset. The training dataset *D* = {*X*<sub>1</sub>, ..., *X<sub>n</sub>*}, with *n* training instances, and the test dataset *T* = {*X*<sub>*n*+1</sub>, ..., *X*<sub>*n*+*m*</sub>}, with *m* test instances, are generated from several FEM simulations; see Tables 2 and 6, where simulations marked in bold belong to *T* and the remaining ones to *D*. Thus, we split our data by the value of the generalization variable rather than randomly, so that each simulation belongs entirely to either *D* or *T*. We index the instances by *i* ∈ {1, 2, ..., *n* + *m*}. An instance *X<sub>i</sub>* = (*x<sub>i</sub>*, *y<sub>i</sub>*) consists of an input vector *x<sub>i</sub>* ∈ ℝ<sup>*p*</sup> and an output vector *y<sub>i</sub>* ∈ ℝ<sup>*q*</sup>. Each input vector *x<sub>i</sub>* is composed of the initial *x*- and *y*-coordinates of a FEM node and the respective generalization variable(s) of the FEM simulation (perforated plate: *Diameter*; beam: *Yield Stress*; block with four perforations: *Width* and *Yield Stress*); see Table 4. Thus, we have *p* = 3 in the plate and beam use cases, and *p* = 4 in the block use case.
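The construction of the input vectors and the simulation-wise split can be sketched as follows. This is a minimal illustration, not our actual pipeline: the function names `build_instances` and `split_by_simulation`, the simulation keys, and all numerical values are hypothetical and chosen only to show the shapes involved.

```python
import numpy as np

def build_instances(node_xy, gen_values):
    """Append the simulation's generalization variable(s) to every node.

    node_xy:    (num_nodes, 2) initial x- and y-coordinates of the FEM nodes
    gen_values: generalization variable(s) of the simulation (length 1 or 2)
    returns:    (num_nodes, p) input vectors with p = 2 + len(gen_values)
    """
    node_xy = np.asarray(node_xy, dtype=float)
    gen = np.tile(np.asarray(gen_values, dtype=float), (node_xy.shape[0], 1))
    return np.hstack([node_xy, gen])

def split_by_simulation(simulations, test_keys):
    """Assign whole simulations to the training or test inputs (no random split)."""
    train, test = [], []
    for key, (node_xy, gen_values) in simulations.items():
        target = test if key in test_keys else train
        target.append(build_instances(node_xy, gen_values))
    return np.vstack(train), np.vstack(test)

# Made-up example: three plate simulations, each with one generalization variable.
simulations = {
    "sim_a": ([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]], [10.0]),
    "sim_b": ([[0.0, 0.0], [1.0, 1.0]], [20.0]),
    "sim_c": ([[0.5, 0.5], [1.0, 0.0]], [30.0]),
}
X_train, X_test = split_by_simulation(simulations, test_keys={"sim_b"})
print(X_train.shape, X_test.shape)
```

With one generalization variable each row has *p* = 3 entries; passing two values (as in the block use case) yields *p* = 4.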

**Table 4.** Surrogate model input variables. Data obtained from FEM simulations are transformed so that each FEM node (represented by its x- and y-coordinates) with the respective generalization variable is an instance.


In our setting, each output vector *y<sub>i</sub>* contains 13 output variables (*q* = 13) obtained from the FEM simulation with input *x<sub>i</sub>*, namely the total strain components *ε*<sup>t</sup><sub>*xx*</sub>, *ε*<sup>t</sup><sub>*xy*</sub> and *ε*<sup>t</sup><sub>*yy*</sub>, the plastic strain components *ε*<sup>p</sup><sub>*xx*</sub>, *ε*<sup>p</sup><sub>*xy*</sub>, *ε*<sup>p</sup><sub>*yy*</sub> and *ε*<sup>p</sup><sub>*zz*</sub>, the normal and shear stress components *σ<sub>xx</sub>*, *σ<sub>xy</sub>*, *σ<sub>yy</sub>* and *σ<sub>zz</sub>*, and the displacements *u* and *v* of each node in the *x*- and *y*-directions; see Table 5 and Figure 4. We split the data into training and test datasets (see Table 6) and standardized them by removing the mean and scaling to unit variance.
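The standardization step can be sketched as follows. This is a minimal example under an assumption the text does not state explicitly: the mean and standard deviation are computed on the training data only and then reused for the test data. The function names and the random placeholder arrays are hypothetical.

```python
import numpy as np

def fit_standardizer(train):
    """Return the per-column mean and standard deviation of the training array."""
    train = np.asarray(train, dtype=float)
    mean = train.mean(axis=0)
    std = train.std(axis=0)  # population standard deviation (ddof=0)
    # Leave constant columns unscaled instead of dividing by zero.
    std = np.where(std == 0.0, 1.0, std)
    return mean, std

def standardize(data, mean, std):
    """Remove the mean and scale to unit variance."""
    return (np.asarray(data, dtype=float) - mean) / std

def destandardize(data, mean, std):
    """Map standardized values back to physical units."""
    return np.asarray(data, dtype=float) * std + mean

# Placeholder outputs with q = 13 columns; random numbers, not FEM results.
rng = np.random.default_rng(0)
Y_train = rng.normal(loc=5.0, scale=3.0, size=(200, 13))
Y_test = rng.normal(loc=5.0, scale=3.0, size=(50, 13))

mean, std = fit_standardizer(Y_train)    # statistics from training data only (assumption)
Y_train_s = standardize(Y_train, mean, std)
Y_test_s = standardize(Y_test, mean, std)
Y_back = destandardize(Y_test_s, mean, std)
```

Reusing the training statistics keeps information about the test simulations out of the preprocessing, and `destandardize` converts surrogate predictions back to strains, stresses and displacements in their original units.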

**Table 5.** Surrogate model output variables. For each input FEM node, a surrogate model predicts its respective strains, stresses and displacements.


Figure 4 shows the Abaqus FEM results, with the mesh visible, for the output variables of the block use case.

**Table 6.** Dataset splits: number of training instances *n* and test instances *m* resulting from the data generation in Table 2.


**Figure 4.** Block use case: Abaqus FEM results that our surrogate models should predict.
