*2.1. Dynamic Co-Design and Optimization Problem and Its Direct Solving Method*

The objective of the dynamic co-design and optimization (DCDO) problem is to solve the optimal input vectors of physical design parameters **x**∗ *<sup>p</sup>* and control variables **u**∗(*t*) that minimizes the system performance index. The standard form of DCDO based on the simultaneous method is described as follows [23,24]:

$$\begin{array}{ll}\min\_{\mathbf{x}\_{p},\mathbf{u}(t)} & J(\mathbf{x}\_{p},\mathbf{u}(t)) = \boldsymbol{\phi}(\mathbf{x}\_{p},\boldsymbol{\xi}(t\_{0}),\boldsymbol{\xi}(t\_{f}),t\_{0},t\_{f}) + \int\_{t\_{0}}^{t\_{f}} L(\mathbf{x}\_{p},\boldsymbol{\xi}(t),\mathbf{u}(t),t)dt\\ \text{s.t.} & \dot{\boldsymbol{\xi}}(t) = \mathbf{f}(\mathbf{x}\_{p},\boldsymbol{\xi}(t),\mathbf{u}(t),t) \\ & \mathbf{Y}(\boldsymbol{\xi}(t\_{0}),t\_{0},\boldsymbol{\xi}(t\_{f}),t\_{f}) = 0 \\ & \mathbf{g}(\mathbf{x}\_{p},\boldsymbol{\xi}(t),\mathbf{u}(t),t) \leq 0 \\ & \mathbf{u}\_{L} \leq \mathbf{u}(t) \leq \mathbf{u}\_{L} \\ & \mathbf{x}\_{L} \leq \mathbf{x}\_{p} \leq \mathbf{x}\_{U} \end{array} \tag{1}$$

where *J* is the response of a cost function that consists of a Mayer term *φ*(·) and Lagrange term *L*(·). *t* ∈ [*t*0, *tf* ] denotes the time horizon, and *t*<sup>0</sup> and *tf* are initial time and terminal times. ξ(*t*0) and ξ(*tf*) indicate the initial and terminal states of the system, while ξ(*t*) and **u**(*t*) are vectors of the state variables and control inputs at moment *t*. DCDO is subject to several different constraints such as state equations **.** ξ(*t*) = **f**(**x***p*, ξ(*t*), **u**(*t*), *t*), boundary constraints **Ψ**, and path constraints **g**. Among these three constraints, the state equations and path constraints are continuous constraints that must be satisfied during the entire time period, while the boundary constraints are discrete constraints that only need to be satisfied at the moments *t*<sup>0</sup> and *tf* . Finally, the vectors of the physical design parameters **x***<sup>p</sup>* and control inputs **u**(*t*) enforce the interval constraints [**x***L*, **x***U*] and [**u***L*, **u***U*], respectively.

In the direct solving approach of the DCDO, the vectors of the state variables ξ(*t*), plant design parameters **x***p*, and control inputs **u**(*t*) are discretized at the time grid nodes by means of the DT. Thus, the DCDO is transformed into an NLP, as shown below.

$$\begin{array}{ll}\min\_{\mathbf{x}\_{p},\mathbf{\Xi},\mathbf{\Theta}} & J(\mathbf{x}\_{p},\mathbf{\Xi},\boldsymbol{\Theta}) = \boldsymbol{\phi}(\mathbf{x}\_{p},\boldsymbol{\xi}(t\_{0}),\boldsymbol{\xi}(t\_{f}),t\_{0},t\_{f}) + \sum\_{k=0}^{N} \boldsymbol{\omega}\_{k} \cdot \boldsymbol{L}(t,\mathbf{x}\_{p},\boldsymbol{\Xi}(k),\boldsymbol{\Theta}(k)) \\ \text{s.t.} & \mathbf{D}\cdot\boldsymbol{\Xi} = \mathbf{f}(\mathbf{x}\_{p},\boldsymbol{\Xi},\boldsymbol{\Theta}) \\ & \mathbf{\overline{\boldsymbol{\xi}}}(\boldsymbol{\xi}(t\_{0}),t\_{0},\boldsymbol{\xi}(t\_{f}),t\_{f}) = 0 \\ & \mathbf{g}(\mathbf{x}\_{p},\boldsymbol{\Xi},\boldsymbol{\Theta}) \leq 0 \end{array} \tag{2}$$

where **Ξ** and **Θ** are the discrete matrices of the state variables and control inputs in the time domain, *ω<sup>k</sup>* is the integration weight, and **D** is the differential matrix specified in different pseudospectral methods [25,26]. When solving the BDCDO of the dynamic system based on the surrogate models of the derivative functions, the right hand-side functions of the state equations are approximated by the surrogate models. Hence, replacing the deriva-

tive function **<sup>f</sup>**(·) in Equation (2) with the surrogate model **^ f**(·) forms a computationally inexpensive expression:

$$\mathbf{D} \cdot \boldsymbol{\Xi} = \hat{\mathbf{f}}(\mathbf{x}\_p, \boldsymbol{\Xi}, \boldsymbol{\Theta}) \tag{3}$$

#### *2.2. Kriging Technique*

In view of the efficiency and effectiveness of the Kriging technique in approximating low-dimensional problems, as well as providing prediction errors at arbitrary points [27], the Kriging technique is adopted to construct the surrogate model in the BDCDO. When training the model, the state variables, physical design parameters, and control variables **X** = [**x***p*, ξ, **u**] *<sup>T</sup>* are fed into the model, and the output are the valuations of the derivative functions **f**(**X**) in the state equation.

The standard Kriging model has two main components: the regression function and the stochastic process [28]. The expression is formulated as follows:

$$\hat{f}(\mathbf{X}) = \mathbf{F}^T(\mathbf{X})\boldsymbol{\mathfrak{F}} + z(\mathbf{X}) \tag{4}$$

where **F***T*(**X**) contains a series of regression functions, and β is a trend coefficient vector. *z*(**X**) denotes a random process with a mean of 0 and variance *σ*<sup>2</sup> *<sup>z</sup>* . The spatial covariance function between the stochastic processes *z*(**X***i*) and *z*(**X***j*) can be expressed as

$$\text{cov}[z(\mathbf{X}\_{i}), z(\mathbf{X}\_{j})] = \sigma\_{z}^{2} \mathbf{G}(\boldsymbol{\Theta}, \mathbf{X}\_{i}, \mathbf{X}\_{j}) \tag{5}$$

where **X***<sup>i</sup>* and **X***<sup>j</sup>* are two different points in the design space, **G** is the Gaussian spatial correlation function, and θ is the corresponding vector of correlation coefficients. The values of **G** and **F** at sample points constitute matrices **R** = **G**(**X***i*,**X***j*) and **S** = **F**(**X***i*). According to the unbiased estimator theory, the least squares solution of the regression model is **^**

$$\overset{\cdots}{\beta} = \left(\mathbf{S}^T \mathbf{R}^{-1} \mathbf{S}\right)^{-1} \mathbf{S}^T \mathbf{R}^{-1} \mathbf{Y} \tag{6}$$

and the maximum likelihood estimation of variance is

$$
\boldsymbol{\sigma}\_z^2 = \frac{1}{m} (\mathbf{Y} - \mathbf{S}\hat{\boldsymbol{\beta}})^T \mathbf{R}^{-1} (\mathbf{Y} - \mathbf{S}\hat{\boldsymbol{\beta}}) \tag{7}
$$

where **Y** = [ ˆ *f*(**X***i*)], *i* = 1, 2, . . . , *m*.

Thus, Equation (4) is translated into

$$\hat{f}(\mathbf{X}) = \mathbf{F}^T(\mathbf{X})\boldsymbol{\mathfrak{F}} + \mathbf{r}^T(\mathbf{X})\mathbf{R}^{-1}(\mathbf{Y} - \mathbf{S}\boldsymbol{\mathfrak{F}}),\\\mathbf{r}(\mathbf{X}) = \left[\mathbf{G}(\boldsymbol{\Theta}, \mathbf{X}, \mathbf{X}\_1), \dots, \mathbf{G}(\boldsymbol{\Theta}, \mathbf{X}, \mathbf{X}\_m)\right]^T \tag{8}$$

Suppose **<sup>d</sup>** <sup>=</sup> **<sup>S</sup>***T***R**−1**<sup>r</sup>** <sup>−</sup> **<sup>S</sup>**, the predictor of the mean square error (MSE) is calculated as follows:

$$MSE = \sigma\_z^2 \left(1 + \mathbf{d}^T \left(\mathbf{S}^T \mathbf{R}^{-1} \mathbf{S}\right)^{-1} \mathbf{d} - \mathbf{r} \mathbf{R}^{-1} \mathbf{r}\right) \tag{9}$$
