**3. The BDCDO Solving Framework**

The BDCDO solving framework based on a surrogate model of the derivative function mainly consists of two parts: refining the surrogate model of the right-hand-side function of the state equation, and solving the black-box dynamic co-design and optimization (BDCDO) problem based on that surrogate model. In the loop that approximates the derivative function, the initial model is built from the initial samples set and then updated by the sequential sampling method. In the loop that solves the BDCDO, the BDCDO is discretized into an NLP at the time grid nodes, and the approximate Jacobian matrices derived from the surrogate model are delivered to the SQP algorithm to solve the NLP. To construct the surrogate model of the derivative function efficiently, a new sampling method based on the successive relative improvement ratios of the discrete trajectory points and on maximizing the distances between sample points, termed SRIRMD, is elaborated in this section. Meanwhile, to quantify the convergence and intuitively reflect the convergence trend of the solution, a new termination criterion, called the state trajectory overlap ratio (STOR), is also introduced in this section; it is based on the fact that the state trajectories tend to coincide as the solving process converges. Finally, the newly proposed sampling strategy, SRIRMD, and termination criterion, STOR, are integrated into the BDCDO solving framework.

#### *3.1. Adaptive Sequential Sampling Strategy*

In static optimization problems, the optimal result is a single point; hence, single-point sampling strategies that focus on improving the accuracy of the surrogate model near the current optimal point are widely used, while batch sampling strategies have been shown to be unable to enhance the performance of model updating [29]. However, the scenario is quite different in dynamic optimization problems, since the optimal results are time-dependent control curves and the corresponding state trajectories. Adding a single new sample point during each iteration contributes little to improving the convergence rate of the state trajectory; instead, it increases the number of modeling and optimization rounds throughout the solution process and consumes more computational resources.

Motivated by the sequential sampling strategies used in static optimization problems, an adaptive sampling strategy applied to the DOP should concentrate on improving the accuracy of the local region where the current state trajectory is located, in order to avoid redundant sampling in uninteresting areas. To this end, it is necessary to pick informative points in the vicinity of the current trajectory to update the surrogate model of the derivative function. Fortunately, the discrete trajectory points (DTPs) on the current state trajectory offer a large number of candidate samples. However, it is not wise to add all DTPs to the samples set, because (a) the expensive evaluation of all DTPs requires considerable computational effort, and (b) closely spaced DTPs contribute little to improving the accuracy but increase the complexity of the model. Hence, a new sequential sampling strategy based on the successive relative improvement ratios of the DTPs and on maximizing the distances between sample points, called SRIRMD, is presented in this work. The SRIRMD sampling strategy prioritizes the points with large successive relative improvement ratios among the DTPs as new samples to refine the surrogate model. Meanwhile, maximizing the minimum distances between new samples and existing samples guarantees that all sample points are distributed uniformly. The specific steps of the SRIRMD method are listed below.

**Step 1:** Obtain the initial values *VI* and the optimal values *VO* at the discrete points of the current state trajectory and calculate the successive relative improvement ratios (srir) of all discrete points.

$$SRIR = \left\{ srir_i \,\middle|\, srir_i = \frac{\left| V_O^i - V_I^i \right|}{\left| V_O^i \right|},\ i = 1, 2, \dots, m \right\} \tag{10}$$

where *m* is the number of discrete points of the current trajectory.

**Step 2:** Generate the candidate samples set *SSRIR* by eliminating from the DTPs the points whose srir is smaller than the allowable deviation factor *β*.

$$S_{SRIR} = \left\{ DTP_i \,\middle|\, srir_i \ge \beta,\ srir_i \in SRIR \right\} \tag{11}$$

**Step 3:** Calculate $\mathbf{D}_{\min}$, whose entries are the distances from each candidate sample in *SSRIR* to the nearest point in the existing samples set *S*(*l*).

**Step 4:** Select new sample *xnew* by the following sampling criterion.

$$\max\ srir(\mathbf{x}_{new}) \cdot \mathbf{D}_{\min}(\mathbf{x}_{new}) \tag{12}$$

**Step 5:** Add the new sample *xnew* into the samples set *S*(*l*) and delete *xnew* from *SSRIR*.

**Step 6:** Repeat Steps 3–5 until the required number of samples is picked, and the samples set is updated to *S*(*l*+1).

**Step 7:** Terminate the current round of sampling.
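As a minimal illustration, the steps above can be sketched in Python with NumPy. This is a sketch under stated assumptions, not the authors' implementation: the function name `srirmd_sampling` and the batch size `n_new` are illustrative, and the expensive derivative-function evaluations are replaced by precomputed value arrays *VI* and *VO*.

```python
import numpy as np

def srirmd_sampling(V_I, V_O, dtps, existing_samples, beta=0.01, n_new=3):
    """Select new samples from discrete trajectory points (DTPs) by the
    SRIRMD criterion: large successive relative improvement ratio (srir)
    combined with maximum distance to the existing samples set.
    (Illustrative sketch; names and batch size are assumptions.)"""
    # Step 1: successive relative improvement ratios of all DTPs, Eq. (10).
    srir = np.abs(V_O - V_I) / np.abs(V_O)

    # Step 2: candidate set S_SRIR keeps points with srir >= beta, Eq. (11).
    mask = srir >= beta
    candidates = dtps[mask]
    cand_srir = srir[mask]

    samples = list(existing_samples)
    selected = []
    for _ in range(min(n_new, len(candidates))):
        # Step 3: distance from each candidate to its nearest existing sample.
        S = np.asarray(samples)
        d_min = np.min(
            np.linalg.norm(candidates[:, None, :] - S[None, :, :], axis=2),
            axis=1,
        )
        # Step 4: sampling criterion of Eq. (12): maximize srir * d_min.
        k = np.argmax(cand_srir * d_min)
        x_new = candidates[k]
        # Step 5: move x_new from the candidate set into the samples set.
        samples.append(x_new)
        selected.append(x_new)
        candidates = np.delete(candidates, k, axis=0)
        cand_srir = np.delete(cand_srir, k)
    # Steps 6-7: repeat until n_new points are picked, then terminate.
    return np.array(selected), np.array(samples)
```

The nearest-neighbor distances are recomputed in every pass because the samples set grows by one point per pass; this greedy max-min update is what spreads the new samples uniformly.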

Figure 1 exhibits the process of selecting new sample points from the DTPs using the SRIRMD strategy. The dotted lines are the previous state trajectories, the dots are the discrete points of those trajectories, and the red solid lines are the current optimal state trajectories. The new points sampled by the SRIRMD method are plotted as black stars; these samples are mainly distributed in the areas with large successive relative improvement ratios while remaining evenly spread.

#### *3.2. Termination Criterion: The State Trajectory Overlap Ratio*

The state trajectory overlap ratio, STOR, is designed to intuitively assess the convergence of state trajectories and serve as the termination criterion. The formula for STOR is expressed as follows:

$$\mathbf{A} = \prod_{i=1}^{d} \alpha_i \tag{13}$$

where *d* is the dimension of the state variables and *α<sup>i</sup>* is the state component trajectory overlap ratio, which reflects the trajectory overlap ratio of the *i*th state component in two successive iterations. As shown in Figure 2, *S*<sup>∗</sup> is the trajectory of the state component *ξ<sup>i</sup>* obtained in the previous iteration, and *S*<sup>∗∗</sup> is the trajectory of *ξ<sup>i</sup>* in the current iteration. The *α<sup>i</sup>* of the state component *ξ<sup>i</sup>* can be calculated from the lengths of the arcs $\widehat{AB}$ and $\widehat{AC}$, and the specific formula is expressed in Equation (14).

$$\alpha_i = \frac{L_{\widehat{AB}}}{L_{\widehat{AC}}} \tag{14}$$

**Figure 2.** Schematic diagram of the state component trajectory overlap ratio.

Equation (14) defines the calculation of *α<sup>i</sup>* in the state space. However, the lengths of the arcs $\widehat{AB}$ and $\widehat{AC}$ are not convenient to calculate in the actual computation procedure. To facilitate the calculation, Equation (14) can be mapped from the state space to the time domain, since the state trajectories and time points correspond (i.e., each time point corresponds to a specific trajectory value). Hence, Equation (14) is replaced by Equation (15) to compute *α<sup>i</sup>*.

$$\alpha_i = \frac{L_{T_1}}{L_{T_0} + L_{T_1}} \tag{15}$$

where *T*<sup>0</sup> and *T*<sup>1</sup> represent the non-overlap and overlap periods of trajectories *S*<sup>∗</sup> and *S*<sup>∗∗</sup> in the time domain *T*, respectively, and $L_{T_0}$ and $L_{T_1}$ denote their total lengths. *T*<sup>0</sup> and *T*<sup>1</sup> can be computed by the following expression.

$$\begin{cases} T_0 = \left\{ t \;\middle|\; \dfrac{\left| \dot{\bar{\xi}}_i\left(\mathbf{x}_p^{**}(t), \boldsymbol{\xi}^{**}(t), \mathbf{u}^{**}(t); SM^{**}\right) - \dot{\bar{\xi}}_i\left(\mathbf{x}_p^{*}(t), \boldsymbol{\xi}^{*}(t), \mathbf{u}^{*}(t); SM^{*}\right) \right|}{\left| \dot{\bar{\xi}}_i\left(\mathbf{x}_p^{**}(t), \boldsymbol{\xi}^{**}(t), \mathbf{u}^{**}(t); SM^{**}\right) \right|} > \beta,\ t \in T \right\} \\[3ex] T_1 = \left\{ t \;\middle|\; \dfrac{\left| \dot{\bar{\xi}}_i\left(\mathbf{x}_p^{**}(t), \boldsymbol{\xi}^{**}(t), \mathbf{u}^{**}(t); SM^{**}\right) - \dot{\bar{\xi}}_i\left(\mathbf{x}_p^{*}(t), \boldsymbol{\xi}^{*}(t), \mathbf{u}^{*}(t); SM^{*}\right) \right|}{\left| \dot{\bar{\xi}}_i\left(\mathbf{x}_p^{**}(t), \boldsymbol{\xi}^{**}(t), \mathbf{u}^{**}(t); SM^{**}\right) \right|} \le \beta,\ t \in T \right\} \end{cases} \tag{16}$$

where *SM*<sup>∗</sup> and *SM*<sup>∗∗</sup> are the surrogate models of the derivative function in the previous and current iterations, respectively. *β* is the allowable deviation factor of the trajectory, which indicates that trajectories *S*<sup>∗</sup> and *S*<sup>∗∗</sup> are still regarded as coinciding at moment *t* when their relative error is not greater than *β*. It is worth noting that *β* = 0.01 in this research. The time domain *T* is a continuous time series, and it can be uniformly discretized as *T* = {*t*0, *t*1, ... , *tτ*, ... , *tf*} to accelerate computations. Thus, Equation (16) is converted into Equation (17).

$$\begin{cases} T_0' = \left\{ t_\tau \;\middle|\; \dfrac{\left| \dot{\bar{\xi}}_i\left(\mathbf{x}_p^{**}(t_\tau), \boldsymbol{\xi}^{**}(t_\tau), \mathbf{u}^{**}(t_\tau); SM^{**}\right) - \dot{\bar{\xi}}_i\left(\mathbf{x}_p^{*}(t_\tau), \boldsymbol{\xi}^{*}(t_\tau), \mathbf{u}^{*}(t_\tau); SM^{*}\right) \right|}{\left| \dot{\bar{\xi}}_i\left(\mathbf{x}_p^{**}(t_\tau), \boldsymbol{\xi}^{**}(t_\tau), \mathbf{u}^{**}(t_\tau); SM^{**}\right) \right|} > \beta,\ \tau = 0, 1, 2, \dots, f \right\} \\[3ex] T_1' = \left\{ t_\tau \;\middle|\; \dfrac{\left| \dot{\bar{\xi}}_i\left(\mathbf{x}_p^{**}(t_\tau), \boldsymbol{\xi}^{**}(t_\tau), \mathbf{u}^{**}(t_\tau); SM^{**}\right) - \dot{\bar{\xi}}_i\left(\mathbf{x}_p^{*}(t_\tau), \boldsymbol{\xi}^{*}(t_\tau), \mathbf{u}^{*}(t_\tau); SM^{*}\right) \right|}{\left| \dot{\bar{\xi}}_i\left(\mathbf{x}_p^{**}(t_\tau), \boldsymbol{\xi}^{**}(t_\tau), \mathbf{u}^{**}(t_\tau); SM^{**}\right) \right|} \le \beta,\ \tau = 0, 1, 2, \dots, f \right\} \end{cases} \tag{17}$$

Furthermore, Equation (14) for calculating *α<sup>i</sup>* is transformed into Equation (18).

$$\alpha_i = \frac{n_{T_1'}}{n_{T_0'} + n_{T_1'}} \tag{18}$$

where $n_{T_0'}$ and $n_{T_1'}$ denote the numbers of elements in the sets $T_0'$ and $T_1'$, respectively.

According to the above series of transformations, including mapping and discretization, the calculation of *α<sup>i</sup>* is finally reduced to counting the elements of the sets. Obviously, the more densely the time domain *T* is discretized in Equation (17), the more accurate *α<sup>i</sup>* in Equation (18) becomes. At the same time, the following two theorems are derived on the basis of the above definitions of A and *α<sup>i</sup>*.
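Assuming both iterations have already been evaluated on a common discrete time grid, the computation of *α<sup>i</sup>* via Equations (16)–(18) and of A via Equation (13) reduces to simple element counting. The following sketch illustrates this; the function name `stor` and the array layout are assumptions for illustration only.

```python
import numpy as np

def stor(xi_dot_prev, xi_dot_curr, beta=0.01):
    """State trajectory overlap ratio A per Eqs. (13) and (16)-(18).

    xi_dot_prev, xi_dot_curr: arrays of shape (d, n_t) holding the
    derivative-function values of the d state components on a common
    discretized time grid for the previous and current iterations.
    (Illustrative sketch; layout and name are assumptions.)"""
    # Relative deviation between the two iterations at every grid point.
    err = np.abs(xi_dot_curr - xi_dot_prev) / np.abs(xi_dot_curr)
    # T_1': overlapping instants (error <= beta); T_0' is its complement.
    overlap = err <= beta
    # Eq. (18): alpha_i = n_{T_1'} / (n_{T_0'} + n_{T_1'}) per component.
    alpha = overlap.sum(axis=1) / overlap.shape[1]
    # Eq. (13): A is the product of the component overlap ratios.
    return np.prod(alpha), alpha
```

Because each row is handled independently, the per-component ratios *α<sup>i</sup>* come out alongside A, which is convenient for diagnosing which state component is slowest to converge.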

**Theorem 1.** *The sufficient condition for the convergence of the state trajectory overlap ratio* A *is that all the state component trajectory overlap ratios α<sup>i</sup> converge*.

**Proof:** From the definitions of A and *α<sup>i</sup>*, it follows that A ∈ [0, 1] and *α<sup>i</sup>* ∈ [0, 1]. From a mathematical perspective, the convergence of A and *α<sup>i</sup>* means that their values converge to 1, i.e.,

$$\mathbf{A} \text{ and } \alpha_i \text{ converge} \iff \mathbf{A} \to 1,\ \alpha_i \to 1 \tag{19}$$


Let A<sup>*k*</sup> denote the product of all the state component trajectory overlap ratios except *α<sup>k</sup>*:

$$\mathbf{A}^k = \prod_{i=1,\dots,k-1,k+1,\dots,d} \alpha_i \tag{20}$$

Then, for any *k* ∈ {1, 2, ... , *d*},

$$\mathbf{A} = \mathbf{A}^k \cdot \alpha_k \tag{21}$$

where A<sup>*k*</sup> ∈ [0, 1] and *α<sup>k</sup>* ∈ [0, 1]. Regardless of the specific value of A<sup>*k*</sup>, A can converge to 1 only if *α<sup>k</sup>* → 1; conversely, if *α<sup>k</sup>* does not converge to 1, then A definitely does not converge to 1, whether or not A<sup>*k*</sup> converges to 1. Applying this argument for *k* = 1, 2, ... , *d* shows that A → 1 requires *α<sup>i</sup>* → 1 for all *i* = 1, 2, ... , *d*; and since every factor lies in [0, 1], *α<sup>i</sup>* → 1 for all *i* immediately yields A = ∏ *α<sup>i</sup>* → 1. That means the sufficient condition for the convergence of A is that all *α<sup>i</sup>* converge. □

**Theorem 2.** *When the state trajectory overlap ratio* A *converges, every state component trajectory overlap ratio α<sup>i</sup> is greater than or equal to* A.

**Proof:** Since A = A<sup>*k*</sup> · *α<sup>k</sup>*, it follows that

$$\alpha_k = \frac{\mathbf{A}}{\mathbf{A}^k} \tag{22}$$

Suppose A converges, i.e., A → 1; then A<sup>*k*</sup> also converges to 1, since the factors of A<sup>*k*</sup> are a subset of the factors of A and A ≤ A<sup>*k*</sup> ≤ 1. Because A<sup>*k*</sup> ≤ 1, *α<sup>k</sup>* ≥ A can be deduced directly from Equation (22). Similarly, letting *k* be 1, 2, ... , *d*, respectively, *α<sup>i</sup>* ≥ A, *i* = 1, 2, ... , *d* can be proved. That means that when A converges, all *α<sup>i</sup>* are greater than or equal to A. □
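Both theorems can be checked numerically. The component overlap ratios below are hypothetical values chosen for illustration (d = 3); they are not results from the paper.

```python
import numpy as np

# Hypothetical component overlap ratios for d = 3 state components.
alpha = np.array([0.95, 0.98, 0.99])
A = np.prod(alpha)  # Eq. (13): A is the product of the alpha_i

# Theorem 2: every alpha_i is greater than or equal to A,
# because dividing A by a factor in [0, 1] can only increase it.
assert np.all(alpha >= A)

# Theorem 1: as all alpha_i approach 1, the product A approaches 1.
alpha_conv = np.array([0.999, 0.9995, 0.9999])
assert np.prod(alpha_conv) > 0.998
```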

According to the definition of the state trajectory overlap ratio A and the proofs of the above two theorems, it can be concluded that taking A as the convergence criterion for the BDCDO solving framework has the following advantages: (a) it directly reflects the convergence trend of the state trajectory during the iterative solving process and, since A ∈ [0, 1], it also quantifies the convergence; and (b) it guarantees the convergence of the state component trajectories, because the state solution of the BDCDO converges only if all the state component trajectories converge.
