*2.2. Meta-Models*

A meta-model is a model of another model; that is, a much-simplified version of a given model that nevertheless provides predictions similar to those of the original for the same values of the parameters. In this work, we restrict our study to linear meta-models. To describe them, let us assume that the model we are trying to simplify depends on $N\_p$ parameters, and we denote the quantity of interest, assumed for simplicity to be a scalar, as $y$. For any collection of parameters $\mathbf{x} \in \mathbb{R}^{N\_p}$, we define $\hat{y}(\mathbf{x})$ to be the corresponding value of the quantity of interest and, slightly abusing the notation, we write

$$\hat{y}(\mathbf{x}) = \left[ \hat{y}(\mathbf{x}\_1), \hat{y}(\mathbf{x}\_2), \dots, \hat{y}(\mathbf{x}\_N) \right], \tag{7}$$

where $N$ is the number of samples and $\mathbf{x} = [\mathbf{x}\_1, \mathbf{x}\_2, \dots, \mathbf{x}\_N]$ is an array of $N$ samples of the parameters. Then, a meta-model is a function $\hat{Y} : \mathbb{R}^{N\_p} \to \mathbb{R}$ that approximates $\hat{y}$ and is of the form

$$\hat{Y}(\mathbf{x}) := \sum\_{k=1}^{N\_k} \eta\_k h\_k(\mathbf{x})\,. \tag{8}$$

In this equation, and in what follows, $h\_k$ are the kernels of the approximation and $\eta\_k$ are the weights. Slightly abusing the notation again, we express the relation between the model and its meta-model as

$$\hat{y}(\mathbf{x}) \approx \hat{Y}(\mathbf{x}) = \sum\_{k=1}^{N\_k} \eta\_k h\_k(\mathbf{x}),\tag{9}$$

or in compact form,

$$\hat{y}(\mathbf{x}) \approx \hat{Y}(\mathbf{x}) := H(\mathbf{x})\boldsymbol{\eta}\,, \tag{10}$$

where *H* is the so-called kernel matrix that collects all the kernel functions.

The precise definition of a meta-model depends, hence, on the number and type of the kernel functions $h\_k$ and on the values of the regression coefficients $\eta\_k$. Given an a priori choice for the kernels, the weights can be obtained from a set of model evaluations $\hat{y}(\mathbf{x})$ by a least-squares minimization. Given, as before, an array of sample parameters $\mathbf{x}$ and their model evaluations $\hat{y}(\mathbf{x})$, the vector $\boldsymbol{\eta}$ can be calculated in closed form as the solution of the normal equations of the approximation. That is,

$$\boldsymbol{\eta} = \left( H(\mathbf{x})^T H(\mathbf{x}) \right)^{-1} H(\mathbf{x})^T \hat{y}(\mathbf{x}). \tag{11}$$
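
The weights in Equation (11) can be computed in a few lines; the following is a minimal sketch in which the kernel matrix is built from two hypothetical monomial kernels, chosen only to keep the example self-contained:

```python
import numpy as np

def fit_meta_model(H, y):
    """Least-squares solution of the normal equations:
    eta = (H^T H)^{-1} H^T y, Equation (11)."""
    # lstsq is numerically safer than forming the inverse explicitly
    eta, *_ = np.linalg.lstsq(H, y, rcond=None)
    return eta

# Hypothetical kernels h_1(x) = 1 and h_2(x) = x, sampled at N = 11 points
x = np.linspace(0.0, 1.0, 11)
H = np.column_stack([np.ones_like(x), x])  # kernel matrix H(x)
y = 2.0 + 3.0 * x                          # model evaluations y(x)
eta = fit_meta_model(H, y)                 # recovers the weights [2, 3]
```

Since this toy model is itself linear in the chosen kernels, the fit is exact; in general, the meta-model only approximates $\hat{y}$.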

As previously indicated, the kernel functions belong to a set that must be selected a priori. In the literature, several classes of kernels have been proposed for approximation purposes; in this work, we select anisotropic radial basis functions (RBF). This type of kernel has shown improved accuracy compared with standard RBF, particularly when only a limited set of model evaluations is available [23].

A standard RBF is a map $K : \mathbb{R}^{N\_p} \times \mathbb{R}^{N\_p} \to \mathbb{R}$ of the form

$$K(\mathbf{x}, \mathbf{z}) = k(r(\mathbf{x}, \mathbf{z})),\tag{12}$$

where $k : \mathbb{R} \to \mathbb{R}$ is a monotonically decreasing function and $r(\mathbf{x}, \mathbf{z}) := |\mathbf{x} - \mathbf{z}|$ is the Euclidean distance. The anisotropic radial kernels redefine the function $K$ to be of the form

$$K(\mathbf{x}, \mathbf{z}) = \exp\left[-\epsilon \sum\_{i=1}^{N\_d} \gamma\_i^2 (x\_i - z\_i)^2\right] = \exp\left[-\epsilon(\mathbf{x} - \mathbf{z})^T \Gamma(\mathbf{x} - \mathbf{z})\right],\tag{13}$$

where $\Gamma = \mathrm{diag}(\gamma\_1^2, \dots, \gamma\_{N\_d}^2)$ is a diagonal, positive definite matrix that anisotropically scales the contribution of each direction in the difference $\mathbf{x} - \mathbf{z}$, and $\epsilon > 0$ is the shape parameter of the kernel.
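
A minimal sketch of the anisotropic kernel in Equation (13), with the scaling factors $\gamma\_i$ and the shape parameter $\epsilon$ passed explicitly (the numerical values below are arbitrary):

```python
import numpy as np

def anisotropic_rbf(x, z, gamma, eps=1.0):
    """Anisotropic Gaussian RBF, Equation (13):
    K(x, z) = exp(-eps * (x - z)^T Gamma (x - z)) with Gamma = diag(gamma_i^2)."""
    d = np.asarray(x) - np.asarray(z)
    return np.exp(-eps * np.sum(gamma**2 * d**2))

# With all gamma_i = 1 and eps = 1, the kernel reduces to the standard
# isotropic RBF: K = exp(-|x - z|^2)
K = anisotropic_rbf([1.0, 0.0], [0.0, 0.0], np.array([1.0, 1.0]))
# K = exp(-1)
```

Increasing a single $\gamma\_i$ makes the kernel decay faster along that direction, which is precisely the anisotropy exploited in the text.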

#### *2.3. Bayesian Inference and Gaussian Processes*

Bayesian inference is a mathematical technique used to improve the knowledge of probability distributions by extracting information from sampled data [24]. It is successfully employed in a wide variety of applications in data science and the applied sciences. For instance, it has been used in conjunction with techniques such as machine learning and deep learning in fields like medicine [25], robotics [26], earth sciences [27], and more. In this article, we employ Bayesian methods to find the optimal value of model parameters as a function of prior information and observed data. More specifically, we are concerned with model calibration, using it in combination with meta-models to obtain realistic parameter values for complex constitutive relations and to quantify their uncertainty, at an affordable computational cost.

Some of the most robust techniques for calibration are based on non-parametric models for nonlinear regression [28]. Here, we employ Gaussian processes to represent, in an abstract fashion, the response of a simulation code for a complex mechanical problem employing the material models that we set out to study. We summarize next the main concepts behind these processes.

A Gaussian process is a set of random variables such that any finite subset of them has a multivariate Gaussian distribution [28]. Such a process is completely defined by its mean and covariance, both of which are functions. If the set of random variables is indexed by points $\mathbf{x} \in \mathcal{X} \subset \mathbb{R}^d$, then, when the random variables have scalar values $f(\mathbf{x})$, the standard notation employed is

$$f(\mathbf{x}) \sim \mathcal{GP}\left(m(\mathbf{x}), k(\mathbf{x}, \mathbf{x}')\right),\tag{14}$$

where $m : \mathcal{X} \to \mathbb{R}$ and $k : \mathcal{X} \times \mathcal{X} \to \mathbb{R}$ are, respectively, the mean and the covariance. The mean function can be arbitrary, but the covariance must be a positive definite function. Often, these two functions are given explicit expressions that depend on hyperparameters. In simple cases, the mean is assumed to be zero, but often it is assumed to be of the form

$$m(\mathbf{x}) = g(\mathbf{x})^T \boldsymbol{\beta}\,, \tag{15}$$

where $g : \mathbb{R}^d \to \mathbb{R}^g$ is a vector of known basis functions and $\boldsymbol{\beta} \in \mathbb{R}^g$ is a vector of basis coefficients. The choice of the covariance function is the key aspect that determines the properties of the Gaussian process. It is often selected to be stationary, that is, depending only on a distance $d = \hat{d}(\mathbf{x}, \mathbf{x}')$. In particular, we will employ a covariance function of the form

$$k(\mathbf{x}, \mathbf{x}') = \sigma^2 r(\mathbf{x}, \mathbf{x}'),\tag{16}$$

where $\sigma^2$ is a variance hyperparameter and $r : \mathbb{R}^d \times \mathbb{R}^d \to \mathbb{R}$ has been chosen as the Matérn $C\_{5/2}$ function, an isotropic, differentiable, stationary kernel commonly used in statistical fitting, which is of the form

$$r(\mathbf{x}, \mathbf{x}') = \left(1 + \sqrt{5}\, \frac{\hat{d}(\mathbf{x}, \mathbf{x}')}{\psi} + \frac{5\, \hat{d}^2(\mathbf{x}, \mathbf{x}')}{3\psi^2} \right) \exp\left[-\sqrt{5}\, \frac{\hat{d}(\mathbf{x}, \mathbf{x}')}{\psi} \right],\tag{17}$$

which uses a length-scale hyperparameter $\psi$. For the Gaussian process described, the hyperparameters can be collected as $\chi = (\boldsymbol{\beta}, \sigma^2, \psi)$.
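
As an illustration, the Matérn $C\_{5/2}$ correlation of Equation (17) can be written as a function of the distance $\hat{d}$ and the length scale $\psi$:

```python
import numpy as np

def matern_52(d, psi):
    """Matern 5/2 correlation, Equation (17), with length-scale psi."""
    s = np.sqrt(5.0) * d / psi
    # (1 + sqrt(5) d/psi + 5 d^2 / (3 psi^2)) * exp(-sqrt(5) d/psi)
    return (1.0 + s + s**2 / 3.0) * np.exp(-s)

# The correlation equals 1 at zero distance and decays monotonically with d
r0 = matern_52(0.0, 1.0)   # r0 = 1.0
```

Unlike the squared-exponential kernel, this choice yields sample paths that are only twice differentiable, which is often a better match for simulation outputs.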

Let us now describe in which sense Bayesian analysis can be used for model calibration. The value of a computer model depends on some input variables $\mathbf{x}$ that are measurable and on some parameters $\mathbf{t}$ that are difficult to determine because they are not directly measurable. Let us assume that $\mathbf{t} = \boldsymbol{\theta}$ is the true value of the parameters, which is unknown and which we would like to determine based on available data. Given some input variable $\mathbf{x}$, a physical experiment of the problem we want to model will produce a scalar output $z$ that satisfies

$$z = \eta(\mathbf{x}, \boldsymbol{\theta}) + \delta(\mathbf{x}) + \varepsilon(\mathbf{x})\,. \tag{18}$$

In this equation, $\eta(\mathbf{x}, \boldsymbol{\theta})$ is the value of the computer model evaluated at the input variable $\mathbf{x}$ and the true parameter $\boldsymbol{\theta}$, $\delta(\mathbf{x})$ is the so-called model inadequacy, and $\varepsilon(\mathbf{x})$ is the observation error. This last term can be taken to be a random variable with a Gaussian probability distribution $\mathcal{N}(0, \lambda^2)$. The functions $\eta$ and $\delta$ are completely unknown, so we can assume them to be Gaussian processes with hyperparameters $\chi\_\eta$ and $\chi\_\delta$, respectively.

If $\boldsymbol{\theta}$, $\chi\_\eta$, $\chi\_\delta$, $\lambda^2$ were known, we could study the multivariate probability distribution of the output $z$ using Equation (18) for any set of inputs $(\mathbf{x}\_1, \mathbf{x}\_2, \dots, \mathbf{x}\_s)$. However, we are interested in solving the inverse problem: we have a set of experimental and computational data, and we would like to determine the most likely probability distribution for $\boldsymbol{\theta}$ and the hyperparameters, a problem that can be effectively addressed using Bayes' theorem.

Bayes' theorem states that, given a prior probability for the parameters $(\boldsymbol{\theta}, \chi\_\eta, \chi\_\delta, \lambda^2)$, denoted $p(\boldsymbol{\theta}, \chi\_\eta, \chi\_\delta, \lambda^2)$, the posterior probability density function of these parameters after obtaining the data $\Delta$ is

$$p(\boldsymbol{\theta}, \chi\_\eta, \chi\_\delta, \lambda^2 \,|\, \Delta) \propto p(\boldsymbol{\theta}, \chi\_\eta, \chi\_\delta, \lambda^2)\; p(\Delta \,|\, \boldsymbol{\theta}, \chi\_\eta, \chi\_\delta, \lambda^2)\,. \tag{19}$$

The prior for the parameters and hyperparameters can be taken as Gaussian, or any other probability distribution that fits our initial knowledge. Assuming that the parameters and hyperparameters are independent, we have in any case that

$$p(\boldsymbol{\theta}, \chi\_\eta, \chi\_\delta, \lambda^2) = p(\boldsymbol{\theta})\; p(\chi\_\eta, \chi\_\delta, \lambda^2)\,. \tag{20}$$

In addition, since Equation (18) indicates that the output is the sum of three random variables with Gaussian distributions, *z* itself is a Gaussian.

To apply Bayes' theorem, it remains to calculate the likelihood $p(\Delta \,|\, \boldsymbol{\theta}, \chi\_\eta, \chi\_\delta, \lambda^2)$. To this end, the hyperparameters of $\eta$ and $\delta$ are collected together with the observation error $\varepsilon$, and we assume that the conditioned random variable $\Delta \,|\, \boldsymbol{\theta}, \chi\_\eta, \chi\_\delta, \lambda^2$ has a normal probability distribution of the form

$$\mathcal{N}\left\{\mathbb{E}[\Delta \,|\, \boldsymbol{\theta}, \chi\_\eta, \chi\_\delta, \lambda^2],\ \mathrm{Var}[\Delta \,|\, \boldsymbol{\theta}, \chi\_\eta, \chi\_\delta, \lambda^2]\right\}. \tag{21}$$

Here, $\mathbb{E}[\cdot]$ and $\mathrm{Var}[\cdot]$ denote the expectation and the variance, respectively. Finally, in order to obtain the marginal posterior probability density of the parameters, $p(\boldsymbol{\theta} \,|\, \Delta)$, we should integrate out $\chi\_\eta$, $\chi\_\delta$, and $\lambda^2$; due to the high computational cost involved, this is typically done using Monte Carlo methods [29]. Details of this process fall outside the scope of the present work and can be found in standard references [12].
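
Although the full calibration machinery is beyond the present scope, the sampling step can be illustrated with a minimal random-walk Metropolis sketch on a one-dimensional toy posterior; the target below is purely illustrative and is not the posterior of Equation (19):

```python
import math, random

def metropolis(log_post, theta0, n_steps=5000, step=0.5, seed=0):
    """Random-walk Metropolis sampler for an unnormalized log-posterior."""
    rng = random.Random(seed)
    theta, lp = theta0, log_post(theta0)
    samples = []
    for _ in range(n_steps):
        prop = theta + rng.gauss(0.0, step)        # symmetric proposal
        lp_prop = log_post(prop)
        if math.log(rng.random()) < lp_prop - lp:  # accept with prob. min(1, ratio)
            theta, lp = prop, lp_prop
        samples.append(theta)
    return samples

# Toy target: Gaussian prior N(0,1) times a Gaussian likelihood centered at 2;
# the resulting posterior is N(1, 1/2), so the sample mean approaches 1
log_post = lambda t: -0.5 * t**2 - 0.5 * (t - 2.0)**2
draws = metropolis(log_post, 0.0)
mean = sum(draws[1000:]) / len(draws[1000:])   # discard burn-in
```

The same accept/reject logic extends to the joint vector $(\boldsymbol{\theta}, \chi\_\eta, \chi\_\delta, \lambda^2)$, at a proportionally higher cost per step.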

#### *2.4. Material Models*

In Section 4, a sensitivity and calibration procedure is applied to three relatively complex material models employed in advanced industrial applications. Here, we summarize them, listing all the parameters involved in their description. The remainder of the article focuses on ranking the significance of these parameters, their influence on the predictions, and the determination of their probability distributions.

#### 2.4.1. Johnson–Cook Constitutive Relations

The Johnson–Cook (JC) constitutive model [17] is commonly used to reproduce the behavior of metals subjected to large strains, high strain rates, and high temperatures. It is not extremely accurate in all ranges of strain rates and temperatures, but it is simple to implement, robust, and, as a result, has been employed over the years in a large number of applications [30–32].

The Johnson–Cook model is a classical $J\_2$ plasticity constitutive law in which the yield stress $\sigma\_y$ is assumed to be of the form:

$$\sigma\_y = \left(A + B \varepsilon\_p^n\right)\left(1 + C \log \dot{\varepsilon}^\*\right)\left(1 - T^{\*m}\right), \tag{22}$$

where *ε <sup>p</sup>* refers to the equivalent plastic strain. The first term in expression (22) accounts for the quasistatic hardening, including the initial yield stress, *A*, and the constants representing the strain hardening, *B*, and the exponent *n*. The second term in (22) is related to the hardening due to strain rate effects, containing the strain rate constant, *C* and also the dimensionless plastic strain rate, *ε*˙ ∗ = *ε*˙ *p ε*˙ *<sup>p</sup>*<sup>0</sup> , where *ε*˙ *<sup>p</sup>*<sup>0</sup> is a reference plastic strain rate, often taken to be equal to 1. Finally, the third term accounts for the effects of temperature, including the thermal softening exponent, *m*, and also the so-called "homologous temperature"

$$T^\* = \frac{T\_{exp} - T\_{room}}{T\_{melt} - T\_{room}}.\tag{23}$$

Here, $T\_{exp}$ is the temperature at which the material is being modeled, $T\_{melt}$ is the melting temperature of the material, and $T\_{room}$ is the ambient temperature.
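
Equations (22) and (23) transcribe directly into code; the numerical values in the usage line below are placeholders, not calibrated JC constants for any particular material:

```python
import math

def jc_yield(eps_p, eps_rate, T_exp, A, B, n, C, m,
             eps_rate0=1.0, T_room=293.0, T_melt=1793.0):
    """Johnson-Cook yield stress, Equations (22)-(23)."""
    T_star = (T_exp - T_room) / (T_melt - T_room)        # homologous temperature
    return ((A + B * eps_p**n)                            # quasistatic hardening
            * (1.0 + C * math.log(eps_rate / eps_rate0))  # strain-rate hardening
            * (1.0 - T_star**m))                          # thermal softening

# At zero plastic strain, reference strain rate, and room temperature,
# the expression reduces to the initial yield stress A
sigma = jc_yield(0.0, 1.0, 293.0, A=350.0, B=275.0, n=0.36, C=0.022, m=1.0)
# sigma = 350.0
```

This separable structure is what makes the JC model so easy to implement and calibrate, one bracket at a time.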

#### 2.4.2. Zerilli–Armstrong Constitutive Relations

The Zerilli–Armstrong (ZA) model [18] was conceived as a new approach to metal plasticity modeling: a finite deformation formulation based on constitutive relations rooted in the physics of dislocation mechanics, in contrast to purely phenomenological constitutive relations such as the previously described Johnson–Cook model. These relations have proven to be well suited for modeling the response of metals to high strains, strain rates, and temperatures. Their numerical implementation, although more complicated than that of the JC model, is still relatively simple, which justifies their popularity.

The ZA relations were developed in response to the need for a physics-based model that could capture the strong dependence of the flow stress of metals and alloys on dislocation mechanics. For instance, according to experimental data, aspects such as the grain size, the thermal activation energy, or the characteristic crystal unit cell structure have a dramatic effect on the plastic response of these materials. Hence, the ZA model is still a $J\_2$ plasticity model in which the yield stress becomes a function of strain, strain rate, and temperature, but with their relative contributions weighted by constants that have physical meaning. The yield stress is assumed to be

$$\sigma\_y = \left(C\_1 + C\_2 \varepsilon\_p^{1/2}\right) \exp\left(-C\_3 T + C\_4 T \log \dot{\varepsilon}\_p\right) + C\_5 \varepsilon\_p^n + k l^{-1/2} + \sigma\_G. \tag{24}$$

In this relation, $C\_1$, $C\_2$, $C\_3$, $C\_4$, $C\_5$, $k$, $\sigma\_G$, and $l$ are constants. The constants $\sigma\_G$, $k$, and $l$ represent, respectively, the contributions to the yield stress due to solutes, the initial dislocation density, and the average grain diameter. The remaining constants distribute the hardening contributions of the plastic strain, its rate, and the temperature. Depending on the crystallographic structure of the metal under study, some of the constants $C\_i$ are set to zero. For example, fcc metals such as copper have $C\_1 = C\_5 = 0$, whereas iron and other bcc metals are represented with equations that have $C\_2 = 0$. These differences mainly reflect the physical influence of strain on each type of structure, which is especially dominant when modeling fcc metals, whereas strain-rate hardening, thermal softening, and grain size have a greater effect on bcc metals.
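
Equation (24) can be transcribed in the same way; the constants in the usage line are illustrative placeholders, with $C\_1 = C\_5 = 0$ mimicking the fcc case described above:

```python
import math

def za_yield(eps_p, eps_rate, T, C1, C2, C3, C4, C5, n,
             k=0.0, l=1.0, sigma_G=0.0):
    """Zerilli-Armstrong yield stress, Equation (24)."""
    thermal = math.exp(-C3 * T + C4 * T * math.log(eps_rate))
    return ((C1 + C2 * math.sqrt(eps_p)) * thermal   # thermally activated part
            + C5 * eps_p**n                          # athermal strain hardening
            + k * l**-0.5 + sigma_G)                 # grain size and solute terms

# fcc-like case: C1 = C5 = 0, so hardening enters only through C2 * sqrt(eps_p)
sigma_fcc = za_yield(0.04, 1.0, 300.0, C1=0.0, C2=890.0,
                     C3=0.0028, C4=0.000115, C5=0.0, n=0.5)
```

Setting the appropriate $C\_i$ to zero, as in this call, is how the single expression (24) specializes to the fcc or bcc variants discussed in the text.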
