#### *3.1. Data Preparation*

To correct the coherent residual stress profiles, stress values are discretized over the depth in the form *d*(*i*)/*d*<sub>max</sub>, leading to 47 points from *d*(0.1) to *d*(4.7), as surface stresses are disregarded (specimen thickness = 4.8 mm). The input also includes information on the known pressure pulses used to generate each residual stress profile with the semi-analytical model. The maximum pressure *P*<sub>max</sub> of the particular pulse serves as normalization for all stress values of the respective profile. Since residual stress profiles can converge towards zero, a division by zero or by very small stress values during normalization is prevented by uniformly shifting all residual stress values above zero, adding twice the material's yield strength *σ*<sub>y</sub> (note that the quasi-static yield strength *A* is used here), see Equations (1) and (2). To enable the prediction of correction factors that produce results of high accuracy, one point of the depth discretization is considered at a time. Thus, the depth at which the correction factor for the residual stresses shall be determined is used as the final input. This yields the following dimensionless input space, including shifted and scaled residual stresses over depth:

$$\mathbf{X}^{i} := \left\{ \frac{\sigma_{\text{ana},1}^{i} + 2\sigma_{y}}{P_{\text{max}}^{i}}, \frac{\sigma_{\text{ana},2}^{i} + 2\sigma_{y}}{P_{\text{max}}^{i}}, \dots, \frac{\sigma_{\text{ana},47}^{i} + 2\sigma_{y}}{P_{\text{max}}^{i}}, \frac{j}{47} \right\} \tag{1}$$

with *i* as the sample number, *j* as the discretization step of the depth in the range from 0.1 mm to 4.7 mm and *P*<sup>*i*</sup><sub>max</sub> as the maximum pressure of the specific sample. The dimensionless output is the correction factor, defined as:

$$\mathbf{Y}^{i} := \left\{ \frac{\sigma_{\text{ana},j}^{i} + 2\sigma_{y}}{\sigma_{\text{FE},j}^{i} + 2\sigma_{y}} \right\} \tag{2}$$

with *σ*<sup>*i*</sup><sub>ana,*j*</sub> and *σ*<sup>*i*</sup><sub>FE,*j*</sub> being the residual stresses at the depth *j*/47, computed by the semi-analytical model and the finite element (FE) model, respectively. The network has a single output, and the queried depth *j*/47 is itself an input. Scanning *j* through the depth therefore yields one stress correction per step, and since the ANN is forced to approximate this depth dependency smoothly, the complete distribution forms a smooth curve. The use of physically normalized inputs and outputs allows predictions in a much wider process parameter range than that used for training the ANN [46], which can be highly beneficial.
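The construction of Equations (1) and (2) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `build_patterns` and the numerical values in the usage note are hypothetical, and the profiles are assumed to be plain lists of 47 stress values in MPa.

```python
from typing import List, Tuple

N_DEPTH = 47  # depth steps d(0.1) ... d(4.7) for the 4.8 mm specimen


def build_patterns(
    sigma_ana: List[float],
    sigma_fe: List[float],
    p_max: float,
    sigma_y: float,
) -> List[Tuple[List[float], float]]:
    """Return one (input, target) pattern per depth step, following Eqs. (1) and (2)."""
    if len(sigma_ana) != N_DEPTH or len(sigma_fe) != N_DEPTH:
        raise ValueError("expected 47 stress values per profile")
    shift = 2.0 * sigma_y  # uniform shift that keeps the ratios away from zero
    # Shifted, pressure-scaled semi-analytical profile shared by all 47 patterns.
    profile = [(s + shift) / p_max for s in sigma_ana]
    patterns = []
    for j in range(1, N_DEPTH + 1):
        x = profile + [j / N_DEPTH]  # Eq. (1): 47 stresses plus the queried depth
        y = (sigma_ana[j - 1] + shift) / (sigma_fe[j - 1] + shift)  # Eq. (2)
        patterns.append((x, y))
    return patterns
```

For example, `build_patterns([-100.0] * 47, [-150.0] * 47, p_max=1000.0, sigma_y=300.0)` returns 47 patterns with 48 input entries each; the stress values and yield strength here are placeholders chosen only to make the arithmetic easy to follow.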

A total of 82 numerical and semi-analytical sample pairs, with pressure pulse parameter ranges listed in Table 2, were utilized. With the proposed depth discretization of 47 points, this led to 82 × 47 = 3854 patterns that composed the complete data set. The data were randomly split into training, validation and test data sets with an 80/10/10 ratio, subject to a stratification of the *P*<sub>max</sub> value range into eight classes, i.e., equidistant subintervals from 800 MPa to 2200 MPa. Thereby, each class is represented in each data set, which ensures that the ranges of maximum pulse values are similar in the training, validation and test data sets. Ultimately, the training, validation and test data sets consisted of 3102, 376 and 376 patterns, respectively. Inputs and outputs were scaled to the value ranges [−1, 1] and [1, 5], respectively. The corrected residual stresses are obtained by solving Equation (2) with respect to absolute values *σ*<sup>*i*</sup><sub>ana,*j*</sub>.
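A stratified split of this kind can be sketched as below. The reported set sizes are consistent with a split at the sample level (3102 = 66 × 47 and 376 = 8 × 47 patterns), but this is an inference from the numbers, not something the text states. The rule of sending one sample per class to validation and one to test is likewise an assumption made for illustration, and the function names are hypothetical.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

P_LOW, P_HIGH, N_CLASSES = 800.0, 2200.0, 8  # MPa, eight equidistant classes


def pressure_class(p_max: float) -> int:
    """Return the index (0..7) of the equidistant P_max subinterval."""
    width = (P_HIGH - P_LOW) / N_CLASSES  # 175 MPa per class
    index = int((p_max - P_LOW) // width)
    return min(max(index, 0), N_CLASSES - 1)  # clamp so 2200 MPa falls in the last class


def stratified_split(
    p_max_values: Sequence[float], seed: int = 0
) -> Tuple[List[int], List[int], List[int]]:
    """Split sample indices into train/validation/test lists, stratified by P_max class."""
    rng = random.Random(seed)
    by_class: Dict[int, List[int]] = defaultdict(list)
    for sample_id, p_max in enumerate(p_max_values):
        by_class[pressure_class(p_max)].append(sample_id)
    train: List[int] = []
    val: List[int] = []
    test: List[int] = []
    for members in by_class.values():
        rng.shuffle(members)
        if len(members) < 3:
            train.extend(members)  # too few samples to hold any out
            continue
        # Assumption: one sample per class for validation and one for test.
        val.append(members[0])
        test.append(members[1])
        train.extend(members[2:])
    return train, val, test
```

With 82 samples spread over all eight classes, this yields 66, 8 and 8 samples, and every depth pattern of a given sample stays in the same subset.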

**Table 2.** Pressure pulse parameter ranges of maximum pressure *P*<sub>max</sub>, time of maximum pressure *t*<sub>I</sub> and pulse duration *t*<sub>II</sub> for training, validation and test data sets.


#### *3.2. Hyperparameters of ANN*

The ANN consists of two hidden layers, each containing 30 neurons. The sigmoid function is used as the activation function of each layer, except for the final layer, where a linear activation function is implemented to obtain continuous values in the proposed regression task. The mean squared error (MSE) loss is minimized by gradient descent on the weights, enhanced with an adaptive learning rate according to the Adam optimizer. Furthermore, early stopping is implemented to avoid overfitting: training is stopped as soon as the generalization error, i.e., the MSE loss on the validation data set, no longer decreases. Before early stopping is executed, a patience of 1000 further epochs is applied to ensure that a local minimum of the validation MSE within this 1000-epoch window does not trigger the stop. The workflow of this study, consisting of data pre-processing, ANN development and result analysis, was executed with the open-source libraries scikit-learn and Keras, in conjunction with a Jupyter Notebook front-end and the TensorFlow back-end.
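The patience rule can be illustrated on a recorded validation-loss history. The sketch below is library-independent and only mirrors the logic described above; the function name is hypothetical. Keras offers an `EarlyStopping` callback with a `patience` argument for the same purpose, although the exact configuration used in the study is not given in the text.

```python
from typing import Sequence, Tuple


def early_stopping_epoch(val_losses: Sequence[float], patience: int = 1000) -> Tuple[int, int]:
    """Return (best_epoch, stop_epoch) for a validation-loss history.

    Training stops once `patience` epochs have passed without a new minimum.
    If that never happens, stop_epoch is the last recorded epoch.
    """
    if not val_losses:
        raise ValueError("val_losses must not be empty")
    best_epoch, best_loss = 0, val_losses[0]
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_epoch, best_loss = epoch, loss  # new minimum resets the waiting period
        elif epoch - best_epoch >= patience:
            return best_epoch, epoch
    return best_epoch, len(val_losses) - 1
```

With the toy history `[1.0, 0.5, 0.6, 0.4, 0.45, 0.5, 0.55]`, a patience of 1 stops at the local minimum in epoch 1, whereas a patience of 3 continues past it and finds the lower loss in epoch 3. This is the effect the 1000-epoch patience is meant to achieve on a longer run.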

#### **4. Development and Evaluation of ANN-Correction Model**

The ANN correction model proposed here is developed and evaluated in two steps. First, the input feature space contains only the semi-analytical residual stresses distributed over depth, normalized with the maximum pressure of the corresponding pulse; in this case, the correction predictions still exhibit significant errors. Second, the input feature space is enriched with additional salient features according to a consistent dimensionality analysis, which decreases those prediction errors. The prediction performance is evaluated with two metrics: the determination coefficient (*R*<sup>2</sup>) and the mean squared error (MSE). *R*<sup>2</sup> is defined as

$$R^2 = 1 - \frac{\sum_{i=1}^{N} \left( y_i - y_{i,\text{pred}} \right)^2}{\sum_{i=1}^{N} \left( y_i - y_{\text{mean}} \right)^2}, \tag{3}$$

where *y*<sub>*i*</sub> represents the true value, *y*<sub>*i*,pred</sub> the predicted value, *y*<sub>mean</sub> the mean of the true values and *N* the number of sample values. MSE is defined as

$$\text{MSE} = \frac{1}{N} \sum_{i=1}^{N} \left( y_i - y_{i,\text{pred}} \right)^2. \tag{4}$$
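Both metrics follow directly from Equations (3) and (4). The sketch below uses plain Python so that each term can be compared with the formulas; the function names are illustrative.

```python
from typing import Sequence


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error, Eq. (4)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Determination coefficient, Eq. (3)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean) ** 2 for t in y_true)  # total sum of squares
    if ss_tot == 0.0:
        raise ValueError("R^2 is undefined when all true values are identical")
    return 1.0 - ss_res / ss_tot
```

For the true values `[1, 2, 3, 4]` and predictions `[1.1, 1.9, 3.2, 3.8]`, the residual sum of squares is 0.10 and the total sum of squares is 5, which gives MSE = 0.025 and *R*<sup>2</sup> = 0.98.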

#### *4.1. Approach 1: Consideration of Only Semi-Analytical Residual Stresses as Input*

In this first approach, the input for the corrective ANN prediction consists only of the semi-analytically determined residual stresses, normalized with the maximum pressure value of the pulse, Equation (1). The so-called "learning curves", i.e., the values of the loss function (the MSE) on the training and validation data sets over the number of training epochs, shown in Figure 6a, indicate a significantly lower MSE for predictions on the training data than on the validation data. In other words, the network has been overfitted to the training data and generalizes poorly, as the prediction error increases on data points outside the training data set. Correspondingly, the predictions of the correction factor, presented in Figure 6b, and of the resulting residual stresses, shown in Figure 6c, deviate from the true/desired values. Specifically, the *R*<sup>2</sup> values for the correction factor, Equation (2), reached 97.08%, 96.65% and 94.94% on the training, validation and test sets, respectively, see Table 3. For the predictions of corrected residual stresses, these deviations are even greater, with *R*<sup>2</sup> values of 91.14%, 91.35% and 81.88% for the training, validation and test sets, respectively, see Table 3. Comparisons of input, output and corrected residual stresses of three exemplary test samples are shown in Figure 7, where the corrections of the semi-analytical stresses are not in good agreement with the desired FE solutions. The correction decreases the error of the stress predictions, but not to a satisfactory extent. To improve the corrective model predictions, i.e., to increase the determination coefficient *R*<sup>2</sup> and decrease the MSE, additional information needs to be provided in the input space of the ANN.

**Figure 6.** (**a**) Learning curves: Mean squared error (MSE)-loss function values minimized via weight adjustment of the ANN on training set and simultaneous MSE for predictions on validation set with training-set weights over number of epochs during training. (**b**) Determination coefficient *R*<sup>2</sup> for correction factor (ANN output) achieved by ANN on training, validation and test data sets. (**c**) Determination coefficient *R*<sup>2</sup> for related residual stresses attained by ANN on training, validation and test data sets.

**Table 3.** Prediction metrics of trained ANN via Approach 1: *R*<sup>2</sup> (determination coefficient) and MSE (mean squared error) for correction coefficients as well as for corresponding residual stresses on training, validation and test data sets, respectively.


**Figure 7.** Comparison of residual stress distributions over depth predicted by the FE model, semi-analytical model and hybrid model for three exemplary test samples with pulse parameters maximum pressure *P*<sub>max</sub>, time of maximum pressure *t*<sub>I</sub> and pulse duration *t*<sub>II</sub> of (**a**) 1236 MPa, 15.1 ns, 85 ns; (**b**) 1639 MPa, 37.7 ns, 145 ns; and (**c**) 1820 MPa, 13 ns, 65.7 ns.

As mentioned in Section 2.2.1, the pulse duration *t*<sub>II</sub> is not considered in the semi-analytical model: according to its input definition, the pressure pulse is only considered until *t*<sub>I</sub>. As a result, samples whose pressure pulses differ only in duration lead to predictions of identical residual stress distributions, see Figure 8. Mathematically, the semi-analytical model is non-injective with respect to the pulse, so the relationship between its output and the FE solution is one-to-many and cannot be represented by any function. In other words, the ANN model of this first approach, where only residual stresses over depth serve as input, cannot map the same input to multiple different outputs. Consequently, as the pulse duration *t*<sub>II</sub> affects the prediction result, it needs to be considered in the input space of the corrective model.
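Whether a pattern set contains such conflicting targets can be checked before training. The following sketch groups targets by their rounded input vector and reports every input that occurs with more than one target; the function name, the rounding tolerance and the toy values are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Set, Tuple


def find_ambiguous_inputs(
    patterns: Sequence[Tuple[Sequence[float], float]], decimals: int = 6
) -> Dict[Tuple[float, ...], List[float]]:
    """Return inputs that are paired with more than one distinct target value."""
    targets: Dict[Tuple[float, ...], Set[float]] = defaultdict(set)
    for x, y in patterns:
        key = tuple(round(v, decimals) for v in x)  # identical inputs share one key
        targets[key].add(round(y, decimals))
    return {key: sorted(vals) for key, vals in targets.items() if len(vals) > 1}
```

Two patterns with the same stress input `[0.5, 0.1]` but targets 1.02 and 0.91 are reported as ambiguous. Once a feature that distinguishes the pulses is appended, for instance a ratio of pulse times, the inputs differ and the result is empty.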

**Figure 8.** (**a**) Super-imposed but indistinguishable residual stress distributions over depth predicted by the semi-analytical model for different pressure pulses, i.e., identical inputs for the corrective ANN-model. (**b**) Corresponding output targets: Eight unique residual stress distributions over depth predicted by the FE model and (**c**) corresponding distinctive pressure pulses over time that were used as input for both models, exhibiting different pulse durations but identical maximum pressures and times of respective maximum pressures.

#### *4.2. Approach 2: Adding Salient Features to the Input Space*

To enable a unique mapping between inputs and outputs, additional input features are identified via a dimensionality analysis and added to the input space. In accordance with the Buckingham Π theorem [47], a required minimum number of dimensionless parameters can be defined to sufficiently describe the physical problem. Thus, besides the analytical stresses *σ*<sub>ana</sub> and the maximum pressure *P*<sub>max</sub>, the pressure pulse time quantities *t*<sub>I</sub>, *t*<sub>II</sub> and *t*<sub>III</sub> are included. To connect those temporal measures to the mechanical properties *E* and *ρ*, the wave speed *c* = √(*E*/*ρ*) is also considered. Finally, the peened area *A*<sub>peened</sub> is used to complete the set of five dimensionless quantities:

$$\Pi_1 = \frac{\sigma_{\text{ana}}}{P_{\text{max}}}, \quad \Pi_2 = \frac{t_{I}}{t_{II}}, \quad \Pi_3 = \frac{t_{III}}{t_{II}}, \quad \Pi_4 = t_{III} \sqrt{\frac{E}{\rho \cdot A_{\text{peened}}}}, \quad \Pi_5 = \frac{P_{\text{max}}}{E}. \tag{5}$$
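The five groups of Equation (5) can be evaluated as in the sketch below, assuming all quantities are given in consistent SI units. The function name and the numerical values in the usage note are placeholders and are not taken from the study.

```python
import math
from typing import Tuple


def pi_groups(
    sigma_ana: float,       # semi-analytical residual stress, Pa
    p_max: float,           # maximum pressure, Pa
    t_i: float,             # time of maximum pressure t_I, s
    t_ii: float,            # pulse duration t_II, s
    t_iii: float,           # pulse time quantity t_III, s
    youngs_modulus: float,  # E, Pa
    density: float,         # rho, kg/m^3
    area_peened: float,     # A_peened, m^2
) -> Tuple[float, float, float, float, float]:
    """Evaluate the dimensionless groups Pi_1 ... Pi_5 of Eq. (5)."""
    wave_speed = math.sqrt(youngs_modulus / density)  # c = sqrt(E / rho), m/s
    return (
        sigma_ana / p_max,
        t_i / t_ii,
        t_iii / t_ii,
        t_iii * wave_speed / math.sqrt(area_peened),  # equals t_III * sqrt(E / (rho * A))
        p_max / youngs_modulus,
    )
```

As a units check with placeholder values, *E* = 70 GPa and *ρ* = 2800 kg/m<sup>3</sup> give *c* = 5000 m/s, so a time of 200 ns and a peened area of 1 mm<sup>2</sup> yield Π<sub>4</sub> = 200 × 10<sup>−9</sup> s × 5000 m/s / 10<sup>−3</sup> m = 1.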

Adding dimensionless information based on a consistent dimensionality analysis to the input space reduces inaccuracies, which is in agreement with a study based on a similar input definition for an ANN [46]. Consequently, the MSE of the predictions decreases further and *R*<sup>2</sup> increases compared to the first approach presented in Section 4.1, since all input-output pairs can now be uniquely identified by the ANN. Accordingly, the modified input is described with:

$$\mathbf{X}^{i} := \left\{ \frac{\sigma_{\text{ana},1}^{i} + 2\sigma_{y}}{P_{\text{max}}^{i}}, \frac{\sigma_{\text{ana},2}^{i} + 2\sigma_{y}}{P_{\text{max}}^{i}}, \dots, \frac{\sigma_{\text{ana},47}^{i} + 2\sigma_{y}}{P_{\text{max}}^{i}}, \frac{t_{I}^{i}}{t_{II}^{i}}, \frac{t_{III}^{i}}{t_{II}^{i}}, t_{III}^{i} \sqrt{\frac{E}{\rho \cdot A_{\text{peened}}}}, \frac{P_{\text{max}}^{i}}{E}, \frac{j}{47} \right\}. \tag{6}$$

This dimensionless formulation ensures that all dependencies are scaled without loss of generality. In comparison to the first approach, the bias and variance indicated by the learning curves in Figure 6a are reduced: the final MSE loss is lower on both the training and validation data sets, and both converge towards similar values, see Figure 9a. Hence, prediction results improved significantly on all three data sets, with determination coefficients *R*<sup>2</sup> above 99% both for the correction factors, see Figure 9b, and for the corrected residual stresses, see Figure 9c. The MSE on the test set declined simultaneously to a maximum of 3.9 × 10<sup>−5</sup> and 28.63 MPa<sup>2</sup> for correction factors and corrected residual stresses, respectively, see Table 4. There is good agreement between the corrected prediction and the desired values of the residual stresses throughout the complete depth, as demonstrated by three examples from the test data set in Figure 10.

**Figure 9.** (**a**) Learning curves: MSE-loss function values on training and validation data sets over number of epochs during training and (**b**) corresponding prediction values of the correction factor versus true values, and of (**c**) the corresponding residual stresses.

**Table 4.** Prediction metrics of the trained ANN via Approach 2: Determination coefficient *R*<sup>2</sup> and MSE for correction coefficients as well as corresponding residual stresses achieved on training, validation and test data sets, respectively.


**Figure 10.** Comparison of residual stress distributions over depth predicted by the FE model, semi-analytical model and hybrid model for three test samples with maximum pressure *P*<sub>max</sub>, time of maximum pressure *t*<sub>I</sub> and pulse duration *t*<sub>II</sub> of (**a**) 1144 MPa, 38.9 ns, 137 ns; (**b**) 1390 MPa, 22.2 ns, 140 ns; and (**c**) 2039 MPa, 49.5 ns, 243 ns, respectively.

#### **5. Generalization of Hybrid Model**

The generalization ability is evaluated by expanding the input parameter space, i.e., the value ranges of the pressure pulse parameters maximum pressure *P*<sub>max</sub>, time of maximum pressure *t*<sub>I</sub> and pulse duration *t*<sub>II</sub>, to ranges that were not used for training, validation and testing, as shown in Figure 11 and Table 5. The lower bound of the maximum pressure range remained at 800 MPa because pressure pulses with a maximum below 800 MPa contribute almost nothing to residual stress formation. In addition, extending maximum pressures above 2400 MPa becomes physically infeasible. Consequently, the *P*<sub>max</sub> range is not expanded significantly; values exceed the training space only slightly. Lower bounds of pulse durations were decreased from 12 ns to 1 ns and upper bounds increased from 66 ns to 100 ns. The expanded-space data set contained 35 samples. With this expanded parameter space, deviations between semi-analytical and high-fidelity solutions can be adequately corrected by the ANN and its trained range of correction factors.

The "learned" range for the correction factors is [0.5090, 1.1189]; thus, the deviation between the analytical and the numerical model has to be correctable by values within that range in order to achieve the anticipated solutions. Restrictions are inevitable when the factor required for an appropriate correction lies outside this range. In this case, no correction is performed by the ANN and the analytical input is also the output, which corresponds to setting the correction factor to 1.0. Thus, the default prediction in a worst-case scenario is the provided input, i.e., the prediction of the semi-analytical model, which can be noticed clearly and used as an indicator that no correction has been performed. Essentially, an extrapolating prediction on an expanded parameter space can only be performed as long as the output of the ANN, i.e., the required correction factor, still lies in the value range of the training data set.
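One possible reading of this fallback is sketched below. Rearranging Equation (2) for the stress in the denominator gives the corrected value, and a factor outside the learned range is replaced by 1.0 so that the semi-analytical stress is returned unchanged. Using the predicted factor itself as the out-of-range indicator is an assumption of this sketch, as are the function name and the example numbers; the text does not state how the case is detected in practice.

```python
from typing import Tuple

LEARNED_RANGE = (0.5090, 1.1189)  # range of correction factors quoted in the text


def corrected_stress(
    sigma_ana: float,
    factor: float,
    sigma_y: float,
    learned_range: Tuple[float, float] = LEARNED_RANGE,
) -> float:
    """Apply a correction factor by inverting Eq. (2), with a no-correction fallback."""
    low, high = learned_range
    if not (low <= factor <= high):
        factor = 1.0  # fallback: output equals the semi-analytical input
    shift = 2.0 * sigma_y
    # Eq. (2) rearranged: sigma_corrected = (sigma_ana + 2*sigma_y) / Y - 2*sigma_y
    return (sigma_ana + shift) / factor - shift
```

With placeholder values of −100 MPa for the semi-analytical stress and 300 MPa for the yield strength, a factor of 500/450 ≈ 1.111 gives a corrected stress of −150 MPa, while a factor of 1.5 lies outside the learned range and returns −100 MPa unchanged.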

**Table 5.** Expanded pressure pulse parameter ranges of maximum pressure *P*<sub>max</sub>, time of maximum pressure *t*<sub>I</sub> and pulse duration *t*<sub>II</sub> as extrapolated parameter space in comparison to the ranges in the data set used for training, validation and testing, see Table 2.


**Figure 11.** Sample positioning in the expanded parameter space: Maximum pressure over (**a**) pulse duration and over (**b**) time of maximum pressure as well as (**c**) time of maximum pressure over pulse duration.

#### *5.1. Setup of Purely Data-Driven ANN as Benchmark*

The prediction performance of the hybrid model is benchmarked against the estimations of a purely data-driven ANN trained directly with pressure-pulse-over-time as input and residual-stresses-over-depth provided by the FE-model as output, without the consideration of any physics-based model. In the following, this purely data-driven ANN is briefly explained. Essentially, no corrective task is performed and the input consists of 47 discretized pressure values and the respective terms defined in the dimensionality analysis with

$$\mathbf{X}_{\text{direct}}^{i} := \left\{ \frac{P_{1}^{i}}{P_{\text{max}}^{i}}, \frac{P_{2}^{i}}{P_{\text{max}}^{i}}, \dots, \frac{P_{47}^{i}}{P_{\text{max}}^{i}}, \frac{t_{I}^{i}}{t_{II}^{i}}, \frac{t_{III}^{i}}{t_{II}^{i}}, t_{III}^{i} \sqrt{\frac{E}{\rho \cdot A_{\text{peened}}}}, \frac{P_{\text{max}}^{i}}{\sigma_{y}}, \frac{j}{47} \right\}. \tag{7}$$

The output space contains the residual stress values, using the same constant discretization over the specimen depth as the output of the corrective model:

$$\mathbf{Y}_{\text{direct}}^{i} := \left\{ \frac{\sigma_{\text{FE},j}^{i} + 2\sigma_{y}}{\sigma_{y}} \right\} \tag{8}$$

where superscript *i* refers to the sample number and subscript *j* to the depth discretization step of 0.1 mm in the range from 0.1 mm to 4.7 mm.

The previous ANN architecture, consisting of two hidden layers with 30 neurons each and sigmoid activation functions, is used to avoid any artificial influence of the ANN architecture on the benchmark. Likewise, early stopping is implemented to avoid overfitting during training. Inputs are normalized to [−1, 1] and outputs to [1, 5], as for the hybrid model.
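The scaling to fixed value ranges can be sketched with a simple min-max transformation, assuming the bounds are taken from the training data. The helper name `fit_minmax` is hypothetical; scikit-learn's `MinMaxScaler` with its `feature_range` argument provides comparable functionality, although the text does not state which tool was used for this step.

```python
from typing import Callable, Sequence, Tuple


def fit_minmax(
    values: Sequence[float], target_range: Tuple[float, float]
) -> Tuple[Callable[[float], float], Callable[[float], float]]:
    """Return (forward, inverse) functions mapping the data range onto target_range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot scale a constant feature")
    a, b = target_range
    scale = (b - a) / (hi - lo)

    def forward(v: float) -> float:
        return a + (v - lo) * scale  # data value -> scaled value

    def inverse(s: float) -> float:
        return lo + (s - a) / scale  # scaled value -> data value

    return forward, inverse
```

Fitting the output scaler on factors spanning [0.5090, 1.1189] with a target range of (1, 5) maps the smallest factor to 1 and the largest to 5, and the inverse function recovers the physical value from a network prediction.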
