**An Entropy Weight-Based Lower Confidence Bounding Optimization Approach for Engineering Product Design**

#### **Jiachang Qian 1,2, Jiaxiang Yi 1, Jinlan Zhang 2, Yuansheng Cheng 1,3 and Jun Liu 1,3,\***


Received: 15 January 2020; Accepted: 18 May 2020; Published: 21 May 2020

**Abstract:** The optimization design of engineering products involving computationally expensive simulation is usually a time-consuming or even prohibitive process. As a promising way to relieve the computational burden, adaptive Kriging-based design optimization (AKBDO) methods have been widely adopted due to their excellent ability for global optimization under limited computational resources. In this paper, an entropy weight-based lower confidence bounding approach (EW-LCB) is developed to objectively make a trade-off between global exploration and local exploitation in the adaptive optimization process. In EW-LCB, entropy theory is used to measure the degree of variation of the predicted value and variance of the Kriging model, respectively. Then, an entropy weight function is proposed to allocate the weights of exploration and exploitation objectively and adaptively based on the values of information entropy. Besides, an index factor, associated with the number of repetitions of the current optimal solution, is defined to prevent the sequential process from falling into local regions. To demonstrate the effectiveness of the proposed EW-LCB method, several numerical examples with different dimensions and complexities and the lightweight optimization design problem of an underwater vehicle base are utilized. Results show that the proposed approach is competitive compared with state-of-the-art AKBDO methods considering accuracy, efficiency, and robustness.

**Keywords:** Kriging; lower confidence bounding; entropy theory; product design; simulation-based design optimization

#### **1. Introduction**

Computational simulation models, e.g., finite element analysis (FEA) and computational fluid dynamics (CFD) models, have been widely used in engineering design problems to replace physical experiments, reducing the time cost and shortening the product development cycle. However, it is still computationally prohibitive to solve engineering design optimization problems by relying directly on simulation models, even though the storage capacity and computing efficiency of computers continue to grow rapidly [1,2]. A popular strategy to address this limitation is to adopt surrogate models, also known as meta-models or approximate models, to replace the computational simulation model during the optimization process. There are several varieties of surrogate models, such as the polynomial response surface (PRS) model [3], the radial basis function (RBF) model [4,5], the Kriging model [6–8], and the support vector regression (SVR) model [9,10]. Among these surrogate models, the Kriging model has been intensively used in engineering design optimization because it can provide not only the predicted value of an un-sampled point but also the confidence interval of that prediction.

The Kriging model-based design optimization methods can be divided into two types [11]: the off-line type and the on-line type. The off-line type uses all the computational resources to construct the final Kriging model at once, and the model is not updated during the optimization process. Therefore, the determination of the sampling plan is crucial because the optimization process may fall into a local optimum region if the number of samples is too small [12]. On the contrary, computational resources are wasted if the number of samples is too large. To resolve this dilemma, the on-line type has been developed, in which an initial Kriging model is built in an earlier stage and new samples are then added to update the Kriging model sequentially through certain criteria, e.g., the maximum mean square error [13], the cross-validation error [14], etc., during the design optimization process. The on-line type can significantly reduce the computational burden compared with the off-line type because the information of the Kriging model is well utilized [15,16].

The on-line type Kriging model-based design optimization methods are also called adaptive Kriging-based design optimization (AKBDO). The target of AKBDO is to obtain the optimum using less computational cost [17,18]. At the same time, the balance between exploration and exploitation is important because it is critical for searching for the global optimum. In detail, exploration refers to the ability of the algorithm to explore the whole design space for the latent optimal region. On the other hand, exploitation aims to identify the local area around the current optimum. Typically, there are several sorts of adaptive sampling approaches for AKBDO with different ways of making a trade-off between exploration and exploitation [19,20], such as the maximum-uncertainty adaptive sampling approaches [20], the efficient global optimization (EGO) methods [21], the lower confidence bounding (LCB) based methods [22], the aggregate-criteria adaptive sampling methods [19], and the multi-criteria adaptive approaches [23,24]. Among these approaches, the efficient global optimization method proposed by Jones [21] has been intensively adopted to handle realistic product design due to its high efficiency and ease of operation. In that work, the expected improvement (EI) function is introduced to quantify the improvement of an unknown point over the current best solution. The new point is obtained by maximizing the EI function, and the Kriging model is updated adaptively by adding the new point to the original sample set. The EI-based EGO method has been intensively investigated in recent years [25–27]. For example, Xiao [28] proposed a weighted EI to make the balance between exploration and exploitation more flexible; Zhan [29] proposed the EI matrix method to solve multi-objective problems. Another famous AKBDO method is the LCB-based method [30].
The LCB function is an effective approach to balance exploration and exploitation by combining the predicted value and variance in a simple way [31]. Subsequently, a parameterized LCB (PLCB) method was proposed that uses a cooling strategy to improve the ability of the original LCB to balance exploration and exploitation [32]. Cheng et al. [33] considered the coefficients of variation of the predicted values and variances to determine the weight factor adaptively during the sequential process. Further, some variants of the LCB methods focus on upper confidence bounding, such as the Gaussian process upper confidence bounding (GPUCB) algorithm proposed by Srinivas et al., which considers the upper confidence bound of noisy functions [34]. A parallel version of GPUCB was developed by Desautel et al. [35]. It is worth mentioning that LCB methods have been widely applied to solve real engineering problems [36,37]. However, the weight factor for balancing the exploration and the exploitation of the LCB-based methods remains an open problem, because most of the existing weighting approaches are subjective or problem-dependent, which makes them less robust across applications.

In this paper, an entropy weight-based lower confidence bounding approach (EW-LCB) is proposed to ascertain the weight of the LCB function adaptively and objectively. In the proposed EW-LCB method, entropy theory is used to quantify the degree of variation of the predicted value and variance of the Kriging model. Then, a new weighted formula is introduced to allocate the weights of exploration and exploitation adaptively. To validate the performance of the proposed EW-LCB method, several numerical functions with different dimensions and complexities and an engineering problem are tested. The computational efficiency, the accuracy of the optimum, and the robustness are considered when comparing EW-LCB with existing well-known AKBDO methods. Results show that the performance of the proposed EW-LCB approach is competitive on the test cases.

The remainder of this paper is organized as follows. In Section 2, the basics of the Kriging model and several existing well-known AKBDO methods are introduced. The details of the proposed approach are described in Section 3 with the assistance of an illustrative example. In Section 4, the effectiveness of the proposed approach is tested on several numerical benchmark problems and an engineering design optimization problem. Finally, conclusions and possible future works are presented in Section 5.

#### **2. Background**

#### *2.1. Kriging Model*

The Kriging model was originally proposed by Krige [38] for ore reserve estimation in the geostatistics community. It was then extended by Sacks et al. [6] for modeling computer experiments. The Kriging model is also called the Gaussian process model, which is a kind of interpolating model. The Kriging model can be expressed as

$$
\hat{y}(\mathbf{x}) = \beta + Z(\mathbf{x}) \tag{1}
$$

where **x** represents the vector of design variables, which is a *d*-dimensional vector **x** = {*x*1, *x*2, ... , *xd*}; β is an unknown parameter that denotes the global tendency; and *Z*(·) is a stationary Gaussian process with zero mean and non-zero variance σ<sup>2</sup>, which represents the local deviation.

In the stationary Gaussian process, spatial correlation is used to model the relationship between any two samples. Generally, the squared exponential (Gaussian) correlation function is utilized, which can be expressed as

$$R(\mathbf{x}_i, \mathbf{x}_j; \boldsymbol{\theta}) = \exp\left(-\sum_{k=1}^{d} \theta_k \left(x_i^k - x_j^k\right)^{p_k}\right) \tag{2}$$

where **θ** and **P** are the hyper-parameters used to control the smoothness of the correlation between two sample points. Generally, the hyper-parameter vector **P** is set to be *pk* = 2, *k* = 1, 2, ... , *d* [39].
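As a concrete illustration, the Gaussian correlation of Equation (2) with *pk* = 2 can be sketched in a few lines of Python; the function name `gauss_corr` is ours, not part of any Kriging library:

```python
import math

def gauss_corr(xi, xj, theta):
    """Gaussian (squared-exponential) correlation of Equation (2) with p_k = 2."""
    return math.exp(-sum(t * (a - b) ** 2 for t, a, b in zip(theta, xi, xj)))
```

The correlation equals one at zero distance and decays as the points separate; a larger θ<sub>k</sub> makes the model vary more rapidly along dimension *k*.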

The key step in the modeling process of the Kriging model is to determine the unknown parameters. Because the responses obey a multivariate Gaussian distribution, the unknown parameters can be obtained by maximum likelihood estimation (MLE) [14]. The likelihood function can be written as

$$L\left(y(\mathbf{x}_1), y(\mathbf{x}_2), \ldots, y(\mathbf{x}_N) \mid \beta, \sigma^2, \boldsymbol{\theta}\right) = \frac{1}{\left(2\pi\sigma^2\right)^{\frac{N}{2}} |\mathbf{R}|^{\frac{1}{2}}} \exp\left[-\frac{(\mathbf{y} - \mathbf{1}\beta)^T \mathbf{R}^{-1} (\mathbf{y} - \mathbf{1}\beta)}{2\sigma^2}\right] \tag{3}$$

where *N* is the number of samples.

Then, Equation (3) can be simplified by taking the natural logarithm,

$$\ln(L) = -\frac{N}{2}\ln(2\pi) - \frac{N}{2}\ln(\sigma^2) - \frac{1}{2}\ln(|\mathbf{R}|) - \frac{(y-\mathbf{1}\boldsymbol{\beta})^T \mathbf{R}^{-1} (y-\mathbf{1}\boldsymbol{\beta})}{2\sigma^2} \tag{4}$$

where **y** is an *N*-dimensional vector that consists of the real responses, **1** is an *N*-dimensional vector of ones, and **R** is the *N* × *N* correlation matrix whose entries are given by Equation (2).

The estimates of β and σ<sup>2</sup> can be obtained by setting the derivatives of Equation (4) with respect to β and σ<sup>2</sup> to zero,

$$\hat{\beta} = \frac{\mathbf{1}^T \mathbf{R}^{-1} \mathbf{y}}{\mathbf{1}^T \mathbf{R}^{-1} \mathbf{1}} \tag{5}$$

$$
\hat{\sigma}^2 = \frac{\left(\mathbf{y} - \mathbf{1}\hat{\beta}\right)^T \mathbf{R}^{-1} \left(\mathbf{y} - \mathbf{1}\hat{\beta}\right)}{N} \tag{6}
$$

Then, substituting Equations (5) and (6) into Equation (4) and removing the constant terms, Equation (4) yields the concentrated ln-likelihood function

$$\ln(L) = -\frac{N}{2}\ln\left(\hat{\sigma}^2\right) - \frac{1}{2}\ln(|\mathbf{R}|) \tag{7}$$

It is difficult to obtain an analytical solution for **θ** because of the high non-linearity and non-differentiability of Equation (7). Therefore, a numerical solution is obtained instead. Optimization algorithms, such as the genetic algorithm (GA) [40] and the particle swarm optimization (PSO) algorithm [41], can be used to find the optimal values of **θ**.
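The θ-estimation step can be sketched as follows, using the toy data of the illustrative example in Section 3 and a coarse grid search in place of the GA/PSO mentioned above. The helper names and the tiny diagonal nugget are our own assumptions for a self-contained sketch, not part of the original method:

```python
import math

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting.

    Returns the solution vector and det(A); adequate for the tiny
    correlation matrices of this sketch.
    """
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    det = 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        if p != c:
            M[c], M[p] = M[p], M[c]
            det = -det
        det *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (M[r][n] - sum(M[r][k] * z[k] for k in range(r + 1, n))) / M[r][r]
    return z, det

def concentrated_lnL(theta, X, y):
    """Concentrated ln-likelihood of Equation (7) for a 1-D Kriging model."""
    N = len(X)
    # Gaussian correlation matrix; the tiny diagonal nugget is our own
    # addition for numerical stability.
    R = [[math.exp(-theta * (xi - xj) ** 2) + (1e-10 if i == j else 0.0)
          for j, xj in enumerate(X)] for i, xi in enumerate(X)]
    Rinv_y, detR = solve(R, y)
    Rinv_1, _ = solve(R, [1.0] * N)
    beta = sum(Rinv_y) / sum(Rinv_1)                        # Equation (5)
    resid = [yi - beta for yi in y]
    Rinv_r, _ = solve(R, resid)
    sigma2 = sum(a * b for a, b in zip(resid, Rinv_r)) / N  # Equation (6)
    return -N / 2.0 * math.log(sigma2) - 0.5 * math.log(detR)

# Toy data from the illustrative example in Section 3; a coarse grid
# search stands in for the GA/PSO mentioned above.
X = [0.0, 0.5, 1.0]
y = [-0.0445, -0.1229, 0.7343]
grid = [10.0 ** (k / 10.0) for k in range(-20, 21)]
best_theta = max(grid, key=lambda t: concentrated_lnL(t, X, y))
```

In practice, the DACE toolbox mentioned in Section 3 performs this maximization internally with far more robust numerics.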

The Kriging model is widely adopted in surrogate model-based engineering optimization because it can provide both the predicted value and the variance [42]. The predicted value of an un-sampled point can be determined by minimizing the mean square error. Thus, the predicted value and variance can be expressed as

$$
\hat{y}(\mathbf{x}) = \hat{\beta} + \mathbf{r}(\mathbf{x})^T \mathbf{R}^{-1} \left(\mathbf{y} - \mathbf{1}\hat{\beta}\right) \tag{8}
$$

$$s^2(\mathbf{x}) = \hat{\sigma}^2 \left[ 1 - \mathbf{r}(\mathbf{x})^T \mathbf{R}^{-1} \mathbf{r}(\mathbf{x}) + \frac{\left(1 - \mathbf{1}^T \mathbf{R}^{-1} \mathbf{r}(\mathbf{x})\right)^2}{\mathbf{1}^T \mathbf{R}^{-1} \mathbf{1}} \right] \tag{9}$$

where *r*(*x*) is an *N*-dimensional vector representing the spatial correlation between an un-sampled point and the sample points, which can be defined by

$$r(\mathbf{x}) = \{R(\mathbf{x}, \mathbf{x}\_1), R(\mathbf{x}, \mathbf{x}\_2), \dots, R(\mathbf{x}, \mathbf{x}\_N)\} \tag{10}$$

#### *2.2. Review of the Typical Adaptive Surrogate-Based Design Optimization Methods*

The goal of the AKBDO methods is to obtain the optimum with a limited computational budget. In this section, four popular AKBDO methods are briefly introduced.

#### 2.2.1. The Lower Confidence Bounding Method

With a concise expression, the LCB method is a popular AKBDO method, which can be expressed as

$$
LCB(\mathbf{x}) = \hat{y}(\mathbf{x}) - b\,\hat{s}(\mathbf{x}) \tag{11}$$

where *y*ˆ(*x*) and *s*ˆ(*x*) are the predicted value and standard deviation, respectively. *b* is a factor utilized to control the weight between the *y*ˆ(*x*) and *s*ˆ(*x*) for the sake of balancing the exploration and the exploitation.

The goal of the LCB function is to identify new sample points through the combination of the predicted value and variance in Equation (11). The point with a small predicted value or a large uncertainty is chosen. Generally, a larger *b* places more emphasis on global exploration. On the contrary, with a small *b* value, the algorithm pays more attention to local exploitation. Cox and John reported that *b* = 2 and *b* = 2.5 can give a more efficient search [43].
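A minimal sketch of the LCB criterion of Equation (11), assuming predicted means and standard deviations are already available from a Kriging model; the candidate values below are invented for illustration:

```python
def lcb(y_hat, s_hat, b=2.0):
    """Lower confidence bound of Equation (11); b = 2 follows Cox and John."""
    return y_hat - b * s_hat

# Invented (mean, standard deviation) pairs for two candidate points:
# a large uncertainty can outweigh a slightly better mean.
candidates = [(0.10, 0.30), (0.05, 0.01)]
best = min(candidates, key=lambda c: lcb(*c))
```

Here the first candidate wins despite its worse mean, illustrating how the uncertainty term drives exploration.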

#### 2.2.2. The Parameterized Lower Confidence Bounding Method

The weight factor in the LCB method is constant, indicating that the contributions of the predicted value and standard deviation will be fixed during the optimization process. Thus, the parameterized lower confidence bounding (PLCB) method is proposed [32], which can be defined by

$$
PLCB(\mathbf{x}) = a_i\,\hat{y}(\mathbf{x}) - b_i\,\hat{s}(\mathbf{x}) \tag{12}$$

where a new parameter *ai* is introduced to regulate the influence of the predicted value during the iteration process of the design optimization. Meanwhile, the values of *ai* and *bi* vary during the iteration process, where *i* is the iteration index of the sequential process. In detail, the values of the parameters *ai* and *bi* can be expressed as

$$a_i = 1, \quad b_i = \left(1 + \cos\left(\frac{i\pi}{m}\right)\right) \Big/ \sin\left(\frac{i\pi}{m}\right) \tag{13}$$

where *m* is a user-defined parameter; it is set to *m* = 3 in Ref. [32].

According to Equation (12), the algorithm tends to focus on exploration when *bi*/*ai* has a larger value, while it tends to focus on exploitation when *bi*/*ai* has a relatively small value. Specifically, the value of *bi*/*ai* in the PLCB function is larger in the early iterations and becomes relatively small as the algorithm proceeds. Consequently, the PLCB algorithm shows a better ability to balance exploitation and exploration when compared with the LCB method.
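The cool-strategy weights of Equation (13) can be sketched as below. The guard against a vanishing sine is our own addition, since the formula is singular when *i* is a multiple of *m*:

```python
import math

def plcb_weights(i, m=3):
    """Cool-strategy parameters a_i and b_i of Equation (13).

    The guard below is our own addition: the expression is singular
    when i is a multiple of m.
    """
    s = math.sin(i * math.pi / m)
    if abs(s) < 1e-12:
        raise ValueError("Equation (13) is undefined when i is a multiple of m")
    a_i = 1.0
    b_i = (1.0 + math.cos(i * math.pi / m)) / s
    return a_i, b_i
```

For *m* = 3 this gives *b*1 = √3 ≈ 1.732 and *b*2 = 1/√3 ≈ 0.577, so the criterion shifts from exploration toward exploitation as the iterations proceed, matching the discussion above.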

#### 2.2.3. The Expected Improvement Method

The expected improvement method is a famous AKBDO method proposed by Jones [21]. The expected improvement function is defined to measure the latent improvement of an unknown point over the current optimum, which can be expressed as

$$I(\mathbf{x}) = \max\left(y_{\min} - Y(\mathbf{x}),\ 0\right) \tag{14}$$

where *y*min is the current best response and *Y*(*x*) is treated as a Gaussian random variable with mean *y*ˆ(*x*) and standard deviation *s*ˆ(*x*). The expected improvement can be formalized as

$$E[I(\mathbf{x})] = E\left[\max\left(y_{\min} - Y(\mathbf{x}),\ 0\right)\right] \tag{15}$$

which can be expanded into

$$E[I(\mathbf{x})] = \left(y_{\min} - \hat{y}(\mathbf{x})\right)\Phi\left(\frac{y_{\min} - \hat{y}(\mathbf{x})}{\hat{s}(\mathbf{x})}\right) + \hat{s}(\mathbf{x})\,\phi\left(\frac{y_{\min} - \hat{y}(\mathbf{x})}{\hat{s}(\mathbf{x})}\right) \tag{16}$$

where Φ and φ are the cumulative density function and probability density function of the standard normal distribution, respectively.

According to Equation (16), the first term mainly focuses on the exploitation and the second term primarily concerns the exploration. The point with the maximum value of the EI function is regarded as the new sample to update the Kriging model during the iteration process.
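The closed form of Equation (16) uses only the standard normal pdf and cdf, so it can be sketched with the Python standard library; the function names are ours:

```python
import math

def norm_pdf(z):
    """Standard normal probability density function."""
    return math.exp(-z * z / 2.0) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def expected_improvement(y_min, y_hat, s_hat):
    """Closed-form EI of Equation (16)."""
    if s_hat <= 0.0:
        return 0.0  # no uncertainty left at this point
    z = (y_min - y_hat) / s_hat
    return (y_min - y_hat) * norm_cdf(z) + s_hat * norm_pdf(z)
```

The candidate maximizing this quantity becomes the next update point, as described above.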

#### 2.2.4. The Weighted Expected Improvement Method

Although the EI method can balance the exploration and the exploitation, its efficiency is problem-dependent because the EI method provides a fixed compromise between the exploration and the exploitation. To address this issue, a weighted expected improvement method (WEI) [28] is developed, in which a tunable weight is adopted to adjust the contributions of exploration and exploitation. The WEI can be given by

$$E[I(\mathbf{x})] = w\left(y_{\min} - \hat{y}(\mathbf{x})\right)\Phi\left(\frac{y_{\min} - \hat{y}(\mathbf{x})}{\hat{s}(\mathbf{x})}\right) + (1 - w)\,\hat{s}(\mathbf{x})\,\phi\left(\frac{y_{\min} - \hat{y}(\mathbf{x})}{\hat{s}(\mathbf{x})}\right) \tag{17}$$

where *w* is the weight coefficient. A larger value of *w* indicates that the WEI focuses more on exploitation; otherwise, the WEI method emphasizes exploration.
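A sketch of the weighted criterion of Equation (17); setting *w* = 1 reduces it to the pure exploitation term, and *w* = 0 to the pure exploration term:

```python
import math

def norm_pdf(z):
    """Standard normal probability density function."""
    return math.exp(-z * z / 2.0) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def weighted_ei(y_min, y_hat, s_hat, w=0.5):
    """Weighted EI of Equation (17); w = 0.5 gives an even compromise."""
    if s_hat <= 0.0:
        return 0.0
    z = (y_min - y_hat) / s_hat
    exploit = (y_min - y_hat) * norm_cdf(z)  # first (exploitation) term
    explore = s_hat * norm_pdf(z)            # second (exploration) term
    return w * exploit + (1.0 - w) * explore
```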

#### **3. Proposed Approach**

The goal of the proposed entropy weight-based lower confidence bounding approach (EW-LCB) is to obtain an optimal solution with less computational burden through a sequential process. In EW-LCB, a new weight factor is developed, which allocates weights to balance global exploration and local exploitation by quantifying the degree of variation of the predicted value and variance of the Kriging model, respectively. In detail, entropy theory is adopted to evaluate the relative discrepancy between the predicted value and the uncertainty of the Kriging model. The framework of EW-LCB is shown in Figure 1, which is composed of six steps.

**Figure 1.** The framework of the proposed entropy weight algorithm (EW-LCB).

To demonstrate the proposed EW-LCB approach more intuitively and in detail, a one-dimensional toy example is utilized. The test function is adopted from Ref. [33], which can be expressed as

$$y = 0.5\sin(4\pi\sin(x + 0.5)) + \frac{1}{3}(x + 0.5)^2; \quad x \in [0, 1] \tag{18}$$

The objective is to obtain the minimum value of Equation (18). This function has a local minimum *y* = −0.0445 at *x* = 0 and the global minimum *y* = −0.1341 at *x* = 0.5312.
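Equation (18) is easy to check numerically; a sketch (the function name `toy` is ours):

```python
import math

def toy(x):
    """Illustrative test function of Equation (18)."""
    return 0.5 * math.sin(4.0 * math.pi * math.sin(x + 0.5)) + (x + 0.5) ** 2 / 3.0
```

Evaluating it at *x* = 0 and *x* = 0.5312 reproduces the local and global minima quoted above, and at *x* = 1 the response 0.7343 used in Step 1 below.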

The details of the steps are elaborated as follows:

#### *3.1. Step 1: Generate the Initial Sample Set*

The generation of the initial sample set includes the determination of the number and the locations of the initial sample points, which is a crucial component of AKBDO. If too few points are generated, the AKBDO runs the risk of falling into a local optimum because of the poor accuracy of the initial Kriging model. On the other hand, it may waste computational resources if too many initial samples are utilized, especially when dealing with costly engineering problems. For the tested cases, a commonly used initial sample size rule *N* = 10 × *d* is adopted [21,44]. A sensitivity analysis of the initial sample size is discussed in the next section. Besides, how to allocate the locations of the initial samples is another tricky issue. More uniformly distributed sample points are preferred because the initial Kriging model can then obtain more information about the landscape of the real function. Therefore, the Latin Hypercube sampling (LHS) method [45] is used, which guarantees that the samples are distributed uniformly along each dimension.
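A minimal Latin Hypercube sampler over the unit hypercube can be sketched as follows: each dimension is split into *N* equal strata, each stratum receives exactly one point, and independent shuffles decouple the dimensions. The `seed` parameter is our own addition for reproducibility:

```python
import random

def lhs(n, d, seed=0):
    """Minimal Latin Hypercube sample of n points in the d-dimensional unit cube."""
    rng = random.Random(seed)
    cols = []
    for _ in range(d):
        col = [(i + rng.random()) / n for i in range(n)]  # one point per stratum
        rng.shuffle(col)                                  # decouple the dimensions
        cols.append(col)
    # Transpose the per-dimension columns into a list of n points
    return [[cols[j][i] for j in range(d)] for i in range(n)]
```

Scaling each coordinate to the actual design bounds is then a simple affine transformation.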

Due to the simple landscape of the illustration example, the initial sample points are set to be *x* = [0, 0.5, 1], which is less than the recommended initial sample size. Herein, the responses of the initial sample points are *y* = [−0.0445, −0.1229, 0.7343], which are obtained by calculating the numerical function in Equation (18).

#### *3.2. Steps 2 and 3: Constructing the Kriging Model and Obtaining the Current Optimal Solution*

In Step 2, the Kriging model is established from the initial sample set using the DACE toolbox [46]. In detail, the regression function, the correlation function, and the initial value of θ are set to '*Regpoly0*', '*Corrgauss*', and (10*d*)<sup>−1/*d*</sup>, respectively. Besides, all the codes are executed on a computational platform with a 4.2 GHz Intel(R) Eight-Core (TM) i7-7700k processor and 64 GB RAM. The initial Kriging model of the illustrated example is plotted in Figure 2, in which the black line and the blue dashed line denote the real function and the initial Kriging model, respectively. Meanwhile, the initial sample points are marked with blue triangles.

**Figure 2.** The initial Kriging model of the illustrated example.

In Step 3, the current optimal value is obtained through a genetic algorithm [47], whose parameter setting is listed in Table 1.


**Table 1.** The parameter setting of the genetic algorithm.

The minimum value of the current responses is −0.1229, which is larger than the actual global optimum. Then, the current minimum value is checked against the stopping criterion in the next step to decide whether the active-learning process continues.

#### *3.3. Step 4: Check the Terminal Condition*

Generally, there are two common ways to stop the sequential process: (1) the difference between the current optimal solution and the actual one reaches an acceptable level, and (2) all the computational resources are used up. In this work, these stopping criteria are adopted for different scenarios. For the numerical functions, because the actual optimal solution is known, the stopping criterion can be associated with this value to test the effectiveness of the proposed approach. Therefore, the stop condition is defined as

$$\varepsilon_r = \left| \frac{\min(y_k(\mathbf{x}_i)) - y^r}{y^r} \right| < \varepsilon_g, \quad i = 1, 2, \ldots, N \tag{19}$$

where min(*yk*(*xi*)) is the minimum value of the current sample set, *y<sup>r</sup>* is the actual optimal solution, and ε*<sup>g</sup>* is a user-defined tolerance. Generally, the adaptive algorithm faces a stricter test when the tolerance is smaller. In this work, the value of ε*<sup>g</sup>* is set to 0.002, following Ref. [32].
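The relative-error stopping rule of Equation (19) is a one-liner; the helper name `converged` is ours:

```python
def converged(y_best, y_true, tol=0.002):
    """Relative-error stopping rule of Equation (19)."""
    return abs((y_best - y_true) / y_true) < tol
```

With the toy example values, `converged(-0.1229, -0.1341)` is false (the relative error is about 0.084), so the sequential process continues.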

However, for the engineering cases, the above stopping criterion is impractical because an engineering problem is usually a black-box problem, so the value of the actual optimal solution is unknown. Therefore, the sequential updating process terminates when the maximum number of iterations is reached, which can be expressed as

$$k \ge K \tag{20}$$

where *k* and *K* denote the current iteration and the maximum iteration, respectively.

If the stopping criterion is satisfied, the sequential process is terminated and the algorithm goes to Step 6. Otherwise, the proposed algorithm goes to Step 5 for a new iteration. In this illustrated example, the relative error between the current optimal solution and the actual one is about 0.084, which exceeds the tolerance. Therefore, the sequential process goes to Step 5.

#### *3.4. Step 5: Update the Sample Set through the Proposed EW-LCB*

To accelerate the adaptive optimization process, a lower confidence bounding function based on entropy theory is developed. Entropy was introduced by Shannon into information theory to quantify the degree of uncertainty in a random variable [48,49]. In this work, it is used to quantify the degree of variation of the predicted value and variance in the sequential optimization process. Herein, the proposed entropy weight method is an objective weighting method, which adaptively assigns weights to the LCB function according to the degree of variation of the predicted values and variances. Specifically, the entropy weight method consists of three major steps: normalize the predicted values and variances, calculate their entropy values, and determine their relative weights.

The EW-LCB function is defined as

$$EWLCB(\mathbf{x}) = w_1\,\hat{y}(\mathbf{x}) - w_2\,\hat{s}(\mathbf{x})\exp\left((-1)^r r\right) \tag{21}$$

where *y*ˆ(*x*) and *s*ˆ(*x*) are the predicted value and estimated standard deviation of the tested point, respectively; *w*1 and *w*2 are the weights reflecting the contributions of *y*ˆ(*x*) and *s*ˆ(*x*), respectively; and *r* counts the iterations for which the current optimal solution has remained unchanged, which is used to prevent the proposed approach from falling into a local optimal region.

To obtain the weights *w*1, *w*2, suppose that there are *N* samples with *m* indexes. The information of the samples can be normalized by

$$Y\_{ij} = \frac{X\_{ij} - \min\{X\_{1j}, X\_{2j}, \dots, X\_{Nj}\}}{\max\{X\_{1j}, X\_{2j}, \dots, X\_{Nj}\} - \min\{X\_{1j}, X\_{2j}, \dots, X\_{Nj}\}} \tag{22}$$

where *Xij* represents the *j*-th index of the *i*-th sample. Equation (22) normalizes each index to the interval [0, 1]. In this work, the value of *m* equals 2. Besides, the number of tested points is set to 1000 to improve the robustness of the entropy weight method.
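The min-max normalization of Equation (22), applied to one index column at a time; the sketch assumes the column is not constant:

```python
def normalize(col):
    """Min-max normalization of Equation (22) for one index column.

    Assumes the column is not constant (otherwise the denominator vanishes).
    """
    lo, hi = min(col), max(col)
    return [(v - lo) / (hi - lo) for v in col]
```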

Then, the entropy value of each index can be determined by

$$E(p_j) = -\frac{1}{\ln(N)} \sum_{i=1}^{N} p_{ij} \ln(p_{ij}), \quad j = 1, 2, \ldots, m \tag{23}$$

where

$$p_{ij} = Y_{ij} \Big/ \sum_{i=1}^{N} Y_{ij} \tag{24}$$

If *pij* = 0, the term *pij* ln(*pij*) in Equation (23) is undefined. In that case, the following convention is adopted to compensate for this deficiency:

$$\lim\_{p\_{ij}\to 0} p\_{ij} \ln(p\_{ij}) = 0 \tag{25}$$

According to Equation (23), the degree of variation of each indicator can be ascertained. The indicator with a larger value of information entropy has a smaller degree of variation. Subsequently, the corresponding entropy weight should be small. As such, the entropy weight can be obtained by

$$w_j = \frac{1 - E(p_j)}{\sum_{j=1}^{m} \left(1 - E(p_j)\right)} \tag{26}$$

According to Equation (26), the weight of each index can be determined adaptively. Besides, *wj* ∈ [0, 1] and Σ*wj* = 1.
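Putting Equations (22) to (26) together, the entropy weights for the two index columns (predicted values and variances) can be sketched as below; the column data in the usage note are invented for illustration, and a constant column would need special handling:

```python
import math

def entropy_weights(columns):
    """Entropy weights of Equations (22)-(26) for m index columns over N points.

    Columns must not be constant; zero entries follow the p*ln(p) -> 0
    convention of Equation (25).
    """
    entropies = []
    for col in columns:
        lo, hi = min(col), max(col)
        Y = [(v - lo) / (hi - lo) for v in col]   # Equation (22)
        total = sum(Y)
        P = [v / total for v in Y]                # Equation (24)
        # Equations (23) and (25): skip zero entries per the convention
        e = -sum(p * math.log(p) for p in P if p > 0.0) / math.log(len(col))
        entropies.append(e)
    denom = sum(1.0 - e for e in entropies)
    return [(1.0 - e) / denom for e in entropies]  # Equation (26)
```

For example, `entropy_weights([[0.0, 0.0, 0.0, 1.0], [0.25, 0.5, 0.75, 1.0]])` assigns the larger weight to the first, lower-entropy column, mirroring the rule above that a smaller information entropy implies a larger degree of variation and hence a larger weight.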

Here we give a brief explanation of the proposed EW-LCB criterion. The term *w*1 *y*ˆ(*x*) is used for local exploitation, which concerns the optimal value. On the other side, the term *w*2 *s*ˆ(*x*) exp((−1)<sup>*r*</sup> *r*) focuses on global exploration, which pays more attention to the uncertainty of the Kriging model for the potential global optimal region. If *w*1 < *w*2 exp((−1)<sup>*r*</sup> *r*), the algorithm focuses more on global exploration, while *w*1 > *w*2 exp((−1)<sup>*r*</sup> *r*) means the algorithm focuses more on local exploitation. The factor exp((−1)<sup>*r*</sup> *r*) serves as a catalyst to help the optimization process escape a local optimal solution. However, this factor may decrease the convergence speed of the proposed algorithm because the weight of the exploration will dominate *EWLCB*(*x*) when the current optimal solution is repeated too many times. Finally, the point with the minimum value of *EWLCB*(*x*) is selected as the new update point.

In this illustrated example, the weight parameters in the first iteration are *w*1 = 0.4961 and *w*2 exp((−1)<sup>*r*</sup> *r*) = 0.5039. This shows that the algorithm focuses slightly more on global exploration than local exploitation. Therefore, another sample point *x* = 0.4432 is added, and the corresponding Kriging model is refreshed, as shown in Figure 3.

**Figure 3.** The first iteration of the proposed approach with the illustrated function.

#### *3.5. Step 6: Output the Optimal Solution*

Once the terminal condition is satisfied, the optimal solution is output. As shown in Figure 4, the optimal solution *x* = 0.5312 is obtained, which equals the global minimum.

**Figure 4.** The final result of the proposed approach on the illustrated function.

As shown in Figure 4, the trend of the actual function is recognized by the proposed approach and the optimal value can be obtained although the global accuracy of the Kriging model is not at a high level.

For comparison, four AKBDO methods, the expected improvement infill criterion (EI) [21], the weighted expected improvement infill criterion (WEI) [28], the lower confidence bounding infill criterion (LCB) [22], and the parameterized lower confidence bounding criterion (PLCB) [32], were tested on this case. To account for the randomness of the LHS and the GA, all the methods were repeated 100 times. The statistical results, including the mean values and standard deviations of the numbers of function calls, are summarized in Table 2.

**Table 2.** Comparison results of different approaches of the illustrated example.


As listed in Table 2, the average number of function calls of the proposed approach is less than those of the four AKBDO methods, indicating that the proposed EW-LCB approach performs better than the four AKBDO methods concerning efficiency. Besides, the standard deviation of the proposed EW-LCB approach is the smallest among all the methods, which means that the proposed approach has the best robustness among all the compared methods in this demonstration case.

#### **4. Tested Cases**

#### *4.1. Numerical Examples*

In this subsection, ten widely used benchmark problems from Refs. [33,50–52] are used to illustrate the effectiveness of the proposed EW-LCB method. Meanwhile, four famous AKBDO approaches, EI, WEI, LCB, and PLCB, are tested for comparison with the EW-LCB method. As the optimal solutions of all benchmark problems are known, the terminal condition is defined such that the relative error between the optimal solution obtained from the Kriging model and the true one is within 0.002. Therefore, the number of iterations is regarded as the measure of effectiveness. Considering the randomness of the results, many AKBDO studies repeat their algorithms dozens of times and provide statistical results [53,54]. Furthermore, some approaches use deterministic sampling and optimization algorithms, such as Hammersley sampling and deterministic PSO [55–57], to avoid randomness. In this work, each method ran 100 times with different initial samples, and the statistical results are recorded in Table 3.


**Table 3.** The comparison of statistical optimization results.

The expressions of the benchmark problems are listed as follows:

• Peaks function (PK)

$$f(\mathbf{x}) = 3(1 - \mathbf{x}\_1)^2 e^{-\mathbf{x}\_1^2 - (\mathbf{x}\_2 + 1)^2} - 10(\frac{\mathbf{x}\_1}{5} - \mathbf{x}\_1^3 - \mathbf{x}\_2^5) e^{-\mathbf{x}\_1^2 - \mathbf{x}\_2^2} - \frac{1}{3} e^{-\mathbf{x}\_2^2 - (\mathbf{x}\_1 + 1)^2}, \\ \mathbf{x}\_{1,2} \in [-3, 3] \tag{27}$$

• Banana function (BA)

$$f(\mathbf{x}) = 100(\mathbf{x}\_1^{\;2} - \mathbf{x}\_2)^2 + (1 - \mathbf{x}\_1)^2, \mathbf{x}\_{1,2} \in [-2, 2] \tag{28}$$

• Sasena function (SA)

$$f(\mathbf{x}) = 2 + 0.01(\mathbf{x}\_2 - \mathbf{x}\_1^2)^2 + (1 - \mathbf{x}\_1)^2 + 2(2 - \mathbf{x}\_2)^2 + 7\sin(0.5\mathbf{x}\_1)\sin(0.7\mathbf{x}\_1\mathbf{x}\_2) \text{ }, \mathbf{x}\_{1,2} \in [0, 5] \tag{29}$$

• Six-hump camp-back function (SC)

$$f(\mathbf{x}) = (4 - 2.1\mathbf{x}\_1^2 + \frac{\mathbf{x}\_1^4}{3})\mathbf{x}\_1^2 + \mathbf{x}\_1\mathbf{x}\_2 + (-4 + 4\mathbf{x}\_2^2)\mathbf{x}\_2^2, \mathbf{x}\_{1,2} \in [-2, 2] \tag{30}$$

*Appl. Sci.* **2020**, *10*, 3554

• Himmelblau function (HM)

$$f(\mathbf{x}) = \left(\mathbf{x}\_1^2 + \mathbf{x}\_2 - 11\right)^2 + \left(\mathbf{x}\_2^2 + \mathbf{x}\_1 - 7\right)^2, \mathbf{x}\_{1,2} \in [-10, 10] \tag{31}$$

• Goldstein–Price function (GP)

$$f(\mathbf{x}) = \left[ 1 + (\mathbf{x}\_1 + \mathbf{x}\_2 + 1)^2 (19 - 14\mathbf{x}\_1 + 3\mathbf{x}\_1^2 - 14\mathbf{x}\_2 + 6\mathbf{x}\_1\mathbf{x}\_2 + 3\mathbf{x}\_2^2) \right] \\ \times \left[ 30 + (2\mathbf{x}\_1 - 3\mathbf{x}\_2)^2 (18 - 32\mathbf{x}\_1 + 12\mathbf{x}\_1^2 + 48\mathbf{x}\_2 - 36\mathbf{x}\_1\mathbf{x}\_2 + 27\mathbf{x}\_2^2) \right], \\ \mathbf{x}\_{1,2} \in [-2, 2] \tag{32}$$

• Generalized polynomial function (GF)

$$\begin{aligned} f(\mathbf{x}) &= u\_1^2 + u\_2^2 + u\_3^2 \\ u\_i &= c\_i - \mathbf{x}\_1 (1 - \mathbf{x}\_2^i), i = 1, 2, 3 \\ c\_1 &= 1.5, c\_2 = 2.25, c\_3 = 2.625, \mathbf{x}\_{1,2} \in [-5, 5] \end{aligned} \tag{33}$$

• Levy 3 function (L3)

$$\begin{aligned} f(\mathbf{x}) &= \sin^2(\pi \omega\_1) + \sum\_{i=1}^2 (\omega\_i - 1)^2 [1 + 10 \sin^2(\pi \omega\_i + 1)] + (\omega\_3 - 1)^2 [1 + \sin^2(2\pi \omega\_3)] \\ \omega\_i &= 1 + \frac{\mathbf{x}\_i - 1}{4}, i = 1, 2, 3, \mathbf{x}\_i \in [-10, 10] \end{aligned} \tag{34}$$

• Hartmann 3 function (H3)

$$\begin{cases} f(\mathbf{x}) = -\sum\_{i=1}^{4} \alpha\_i \exp\left[ -\sum\_{j=1}^{3} \beta\_{ij} (\mathbf{x}\_j - p\_{ij})^2 \right] \\ 0 \le \mathbf{x}\_1, \mathbf{x}\_2, \mathbf{x}\_3 \le 1 \end{cases} \tag{35}$$

where

$$
\boldsymbol{\alpha} = \begin{bmatrix} 1 \\ 1.2 \\ 3 \\ 3.2 \end{bmatrix} \boldsymbol{\beta} = \begin{bmatrix} 3.0 & 10 & 30 \\ 0.1 & 10 & 35 \\ 3.0 & 10 & 30 \\ 0.1 & 10 & 35 \end{bmatrix} \boldsymbol{P} = \begin{bmatrix} 0.3689 & 0.1170 & 0.2673 \\ 0.4699 & 0.4387 & 0.7470 \\ 0.1091 & 0.8732 & 0.5547 \\ 0.0381 & 0.5743 & 0.8828 \end{bmatrix} \tag{36}
$$

• Leon (LE)

$$f(\mathbf{x}) = 100(\mathbf{x}\_2 - \mathbf{x}\_1^3)^2 + (\mathbf{x}\_1 - 1)^2, \mathbf{x}\_1, \mathbf{x}\_2 \in [-10, 10] \tag{37}$$
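For concreteness, a few of the benchmarks above are straightforward to transcribe and sanity-check against their known optima (the minima quoted in the comments are standard published values, not taken from the paper):

```python
def banana(x1, x2):
    """Banana (Rosenbrock) function, Eq. (28); global minimum 0 at (1, 1)."""
    return 100.0 * (x1 ** 2 - x2) ** 2 + (1.0 - x1) ** 2

def six_hump(x1, x2):
    """Six-hump camel-back function, Eq. (30); global minimum about
    -1.0316 at (0.0898, -0.7126) and at its mirror point."""
    return ((4.0 - 2.1 * x1 ** 2 + x1 ** 4 / 3.0) * x1 ** 2
            + x1 * x2 + (-4.0 + 4.0 * x2 ** 2) * x2 ** 2)

def himmelblau(x1, x2):
    """Himmelblau function, Eq. (31); four global minima with value 0,
    one of them at (3, 2)."""
    return (x1 ** 2 + x2 - 11.0) ** 2 + (x2 ** 2 + x1 - 7.0) ** 2
```

Evaluating each function at its published optimum is a quick way to catch transcription errors before running any optimizer on it.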

The statistical results of the 100 runs of the five AKBDO approaches are summarized in Table 3. In Table 3, FEmean represents the mean number of function evaluations, which illustrates the efficiency of a method, while FEstd denotes the standard deviation of the function evaluations, which reflects the robustness of each method [58]. The number after each mean or standard deviation is the rank of the corresponding method on that numerical case; for example, 26.97/1 means the mean value is 26.97 and the method ranks first. The numbers marked in bold represent the first rank among the five AKBDO approaches. It can be seen that EW-LCB ranks first on most of the test problems, which indicates that the proposed EW-LCB outperforms the other compared approaches in terms of effectiveness. To further demonstrate the robustness of the proposed approach, Figure 5 shows the box plots of the numbers of function evaluations over all 100 runs.
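The "mean/rank" entries of Table 3 are simply a sort of the methods' FEmean values per problem. A sketch with hypothetical numbers (only the 26.97 for EW-LCB echoes the example in the text; the other four values are made up):

```python
# Hypothetical FEmean values (mean function evaluations, smaller is better).
fe_mean = {"EI": 41.30, "WEI": 45.80, "LCB": 33.10, "PLCB": 29.40, "EW-LCB": 26.97}

def rank_methods(scores):
    """Assign rank 1 to the smallest score, as in the 'mean/rank' entries."""
    ordered = sorted(scores, key=scores.get)
    return {method: i + 1 for i, method in enumerate(ordered)}

ranks = rank_methods(fe_mean)
entry = f"{fe_mean['EW-LCB']:.2f}/{ranks['EW-LCB']}"  # formatted like Table 3
```

Averaging these per-problem ranks over all ten benchmarks yields the average-ranking summary of Table 4.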

**Figure 5.** The box plot of the FEmean with different methods.

Table 4 shows the average ranking of the performance of the five AKBDO approaches over all the tests. The average ranking of EW-LCB is the best among the five approaches, followed by PLCB, LCB, EI, and WEI. In terms of robustness, the proposed EW-LCB performs better than the PLCB, LCB, and WEI methods, while it is slightly inferior to the EI method. To evaluate whether the differences between the proposed EW-LCB method and the other four approaches are significant, the *p*-values over the multiple test cases are obtained using the Bergmann–Hommel procedure [59]. The statistical test results are listed in Table 5. As shown in Table 5, all the *p*-values are less than 0.05, indicating that there are significant differences in efficiency between the proposed EW-LCB and the other four approaches.


**Table 4.** Average ranking results for all AKBDO approaches considering all numerical cases.

**Table 5.** The *p*-values obtained in the numerical examples by the difference significance test.


To demonstrate the influence of the initial sample size, two other initial sample sizes were studied. The initial sample points are all generated by the LHD method, and functions SA and L3 were selected because function SA needs a small sample size, while function L3 needs a large one. Table 6 shows the results of the comparisons. The numbers after '/' represent the efficiency ranking of each method. It is easy to see that the initial sample size has a great influence on function SA. For function SA, EW-LCB always ranks first, while the ranks of EI, WEI, LCB, and PLCB change considerably. For instance, PLCB ranks second when the initial sample size is 5*d* or 15*d*, but drops to fourth at 10*d*. As for function L3, the numbers of function evaluations of EI, WEI, and LCB change little, while those of PLCB and EW-LCB change significantly when the initial sample size grows from 5*d* to 10*d*. This indicates that PLCB and EW-LCB can perform well with a small sample size even on a quite complex function, whereas EI, WEI, and LCB only exhibit this feature when the function is simple. This is attributed to the objective weighting factors in PLCB and EW-LCB, which are able to balance global exploration and local exploitation. In summary, the EW-LCB method shows the greatest ability to balance global exploration and local exploitation among the five compared AKBDO methods.
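The LHD initial designs mentioned above can be generated with a few lines. The snippet below is a plain stratified Latin hypercube on the unit hypercube, an illustration rather than the specific LHD implementation used in the paper: each dimension is split into *n* equal strata and each stratum receives exactly one sample.

```python
import random

def latin_hypercube(n, d, rng=random):
    """A Latin hypercube design of n points in [0, 1]^d: per dimension,
    one random point inside each of the n strata, in shuffled order."""
    cols = []
    for _ in range(d):
        col = [(i + rng.random()) / n for i in range(n)]  # one point per stratum
        rng.shuffle(col)                                  # decouple the dimensions
        cols.append(col)
    return [tuple(col[i] for col in cols) for i in range(n)]

random.seed(1)
design = latin_hypercube(10, 2)  # e.g. 10 initial points in the unit square
```

Scaling each coordinate affinely maps the design onto the actual variable ranges, e.g. [0, 5] for function SA.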

**Table 6.** Results of different initial sample points for functions SA and L3.


#### *4.2. Engineering Application*

In this section, an underwater vehicle base design problem is utilized to verify the effectiveness of the proposed method. The base is a braced structure for vibration devices in the hull of an underwater vehicle. Its main function is to provide a platform for the installation of the mounted vibration equipment and to avoid the vibration being transmitted to the hull directly. Meanwhile, the mechanical impedance of the base plays a determining role in reducing the noise level. Specifically, the mechanical impedance is expected to remain at a high level over all computational frequencies. The structural profile of the base adjoined to the hull of the underwater vehicle is depicted in Figure 6. The fixed structural and material parameters of the cylindrical shell and the base are listed in Table 7.

**Figure 6.** The structural profile of the base.

**Table 7.** The values of the fixed structural and material parameters.


In this work, the objective is to maximize the minimum difference between the impedance of the design scheme and the required impedance value at the same frequency. Simultaneously, the weight of the optimized scheme should be less than the allowable weight. Therefore, the optimization problem can be described as,

$$\begin{array}{c} \text{find } \mathbf{x} = [\mathbf{x}\_1, \dots, \mathbf{x}\_6] \\ \max f(\mathbf{x}) = \min \{ I(\mathbf{x}, \boldsymbol{\omega}\_i) - I(\mathbf{x}\_0, \boldsymbol{\omega}\_i); i = 1, 2, \dots, k \} \\ \text{s.t. } \ g(\mathbf{x}) = \mathcal{W}(\mathbf{x}) - \mathcal{W}(\mathbf{x}\_0) < 0 \end{array} \tag{38}$$

where *x* represents the six-dimensional vector of design variables and ω*<sup>i</sup>* is the *i*th computational frequency. In detail, the design variables of this problem are the thicknesses of the panels of the base. Figure 7 plots the schematic diagram and Table 8 lists the meanings and value ranges of the design variables, respectively. *I*(*x*, ω*i*) and *W*(*x*) represent the impedance value at a specific computational frequency and the weight of the base, respectively. *I*(*x*0, ω*i*) and *W*(*x*0) are the required impedance value at the *i*th computational frequency and the allowable base weight, respectively.
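A common way to feed the max-min problem of Equation (38) to an unconstrained optimizer is to fold the weight constraint into the objective with a penalty, the route the conclusions later mention for constrained problems. The sketch below is hypothetical: `impedance`, `weight`, and `required` stand in for the simulation-based models *I*(*x*, ω*i*), *W*(*x*), and *I*(*x*0, ω*i*), and the penalty magnitude is an arbitrary choice.

```python
BIG_PENALTY = 1.0e9  # arbitrary death-penalty magnitude (an assumption)

def penalized_objective(x, impedance, weight, required, w_allow, freqs):
    """Eq. (38) folded into one value: the worst impedance margin over the
    computational frequencies, minus a penalty if W(x) >= W(x0)."""
    worst_margin = min(impedance(x, w) - required(w) for w in freqs)
    if weight(x) >= w_allow:     # constraint g(x) = W(x) - W(x0) < 0 violated
        worst_margin -= BIG_PENALTY
    return worst_margin

# Toy stand-ins for the simulation models (purely illustrative).
impedance = lambda x, w: sum(x) + w   # I(x, w_i)
required = lambda w: w                # I(x0, w_i)
weight = lambda x: sum(x)             # W(x)
value = penalized_objective((1.0, 1.0, 1.0), impedance, weight, required,
                            w_allow=5.0, freqs=[0.0, 2.5, 5.0])
```

Maximizing this penalized value with, for example, a genetic algorithm then drives the search toward feasible designs with the largest worst-case impedance margin.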

**Figure 7.** The design variables and loads of the underwater vehicle base.


**Table 8.** The meaning and ranges of the design variables.

Generally, the responses of the impedance curve are obtained through the finite element simulation software ANSYS 18.1. The computational platform is equipped with a 4.01 GHz Intel(R) Core(TM) i9-9900KS eight-core processor and 64 GB RAM. In this simulation, the boundary condition fixes all translational degrees of freedom at both ends of the shell. The loading is a unit vertical force at point *A*, as depicted in Figure 7. The ribs are modeled with the Beam 188 element and the rest of the model with the Shell 181 element. The number of elements has to exceed 34,000 to obtain a desirable accuracy of the impedance value, as shown in Figure 8. The frequency step is set to 2.5 Hz and the computational frequency ranges from 0 to 350 Hz. To improve the efficiency of the optimization, the minimal convex polygon technique is adopted to pre-process the impedance curve. In this way, the global features of the curve and its minimum impedance value are preserved, while the complex, multimodal features that may slow the convergence of the optimization process are filtered out. Figure 9 illustrates the impedance curves before and after pre-processing for the scheme *x* = [60, 60, 30, 30, 24, 24].
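The "minimal convex polygon" pre-processing can be realized as the lower convex hull of the sampled (frequency, impedance) points, which keeps the overall shape and the minimum impedance value while discarding the wiggles above it. The monotone-chain sketch below is an assumption about the exact technique, which the text does not spell out:

```python
def cross(o, a, b):
    """z-component of (a - o) x (b - o); <= 0 means a non-left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def lower_hull(points):
    """Lower convex hull (Andrew's monotone chain) of (frequency, impedance)
    samples; preserves the curve's global minimum, drops points above chords."""
    pts = sorted(points)
    hull = []
    for p in pts:
        # pop the last hull point while it lies on or above the chord to p
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) <= 0:
            hull.pop()
        hull.append(p)
    return hull

# A small wiggly "curve": the hull keeps its minimum point (3, 0.5).
curve = [(0, 3.0), (1, 1.0), (2, 2.5), (3, 0.5), (4, 4.0)]
smoothed = lower_hull(curve)
```

By construction the hull passes through the lowest sample, so the minimum impedance value that drives the max-min objective is untouched by the smoothing.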

**Figure 8.** The mesh model of the base with the shell of the underwater vehicle.


**Figure 9.** The impedance curves before and after pre-processing.

As shown in Figure 9, the red line denotes the real impedance curve, which fluctuates significantly over the frequency range. The black line is the impedance curve after the pre-processing operation; it is smooth, and the minimum value of the original curve remains unchanged. The blue line is the required impedance curve at different frequencies. Moreover, the allowable weight for this case is 3.027 t and the maximum number of iterations is set to 50. To eliminate the randomness of the initial samples and of the genetic algorithm, all methods use the same 60 initial LHS samples and the same settings of the genetic algorithm. In addition, all the methods are repeated 20 times to further reduce the effect of randomness even though the setups are the same. The statistical optimization results of this problem for all compared methods over the 20 runs are summarized in Table 9. Furthermore, the detailed design variables, optimal objective values, and weights of all runs are listed in Table A1 in Appendix A.


**Table 9.** The statistical optimization results of the engineering case with different methods.

As illustrated in Table 9, the best value of the proposed method is 4.062 × 10<sup>5</sup> *Nm*/*s*, which is the maximum optimal value among all methods. Moreover, the proposed method has the maximum mean objective value among all the listed methods, which indicates its effectiveness. Regarding robustness, the proposed method also performs best because it obtains the minimum standard deviation. It is worth mentioning that the LCB and PLCB methods obtain some infeasible solutions: 15 and 9 of the 20 runs failed for the LCB and PLCB methods, respectively. The results show that the proposed method is a stable and effective approach for this engineering optimization problem. Figure 10 shows the impedance curves of the optimal scheme of the proposed approach and of the original scheme. As illustrated in Figure 10, the impedance curve of the optimal scheme is better than that of the original scheme because it has larger impedance values at the frequencies that are critical to the performance of the base, as shown in the sub-figure of Figure 9. Conversely, at the frequencies that are not critical to the performance of the base, the optimal scheme has smaller values than the original one. Therefore, the proposed approach is an effective method for this engineering case.

**Figure 10.** The impedance curve of the optimization solution.

#### **5. Conclusions**

To balance exploration and exploitation during the sequential process of AKBDO, an EW-LCB approach was developed in this work to obtain the optimal solution with fewer computational resources. In the proposed EW-LCB approach, entropy theory was adopted to quantify the degree of variation of the predicted value and of the predicted variance of the Kriging model, respectively. Then, weights were assigned to the LCB function automatically according to the relative values of the information entropy. Meanwhile, an index factor, which changes with the number of iterations for which the current optimum persists, was defined to avoid the sequential process becoming trapped in a local optimum. The updated point was generated by minimizing the EW-LCB function, and the Kriging model was updated sequentially. To test the performance of the proposed EW-LCB method, four typical AKBDO methods, including EI, WEI, LCB, and PLCB, were adopted for comparison on ten widely used benchmark functions and an engineering case. Results show that the proposed EW-LCB approach can obtain the optimum with the desired accuracy at a lower computational cost. Moreover, the proposed method has competitive robustness compared with state-of-the-art methods.

It is worth noting that the proposed method can handle constrained optimization problems by transforming the constrained optimization into an unconstrained one using penalty methods. In practical engineering cases, simulation models with different fidelities are often available; as part of our future work, the developed EW-LCB method will be extended to the multi-fidelity scenario.

**Author Contributions:** Conceptualization, J.Q. and J.Y.; methodology, J.Q.; software, J.Q.; validation, J.Y. and J.Q.; formal analysis, J.Q.; investigation, J.Q.; resources, J.Q.; data curation, J.Q.; writing—original draft preparation, J.Q.; writing—review and editing, J.Q., J.Y., J.Z., Y.C., and J.L.; visualization, J.Q.; supervision, J.Z., Y.C., and J.L.; funding acquisition, Y.C. and J.L. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work was supported by the Research Funds of the Maritime Defense Technologies Innovation.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Appendix A**


**Table A1.** Detailed optimization results of the engineering case with different methods.


**Table A1.** *Cont*.

Note: the weights that are larger than the allowable one are marked in red.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
