### *4.1. Numerical Simulation*

In this numerical simulation, 120 operating points were generated with MATLAB R2019a to simulate the multi-working-condition characteristics and measurement error of industrial data. Specifically, two working conditions were generated with different centers, and Gaussian deviations were added around each center to simulate industrial measurement error. Each working condition consisted of 60 operating points; the centers of working condition 1 and working condition 2 were set as (1, 1) and (−1, −1), respectively, and the standard deviations of both working conditions were set as 0.5. It should be noted that operating points that deviate strongly from their corresponding centers were treated as operating points with gross error, and they should be removed before case reuse. Figure 3 shows the distribution of the generated dataset, which reflects the main characteristics of industrial data.

**Figure 3.** Distribution of the generated dataset.
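The dataset described above can be reproduced with a short sketch (shown here in Python rather than MATLAB; the random seed and generator are arbitrary choices for reproducibility, not part of the original setup):

```python
import numpy as np

rng = np.random.default_rng(0)          # arbitrary seed, for reproducibility only
n_per_condition = 60
centers = np.array([[1.0, 1.0],         # center of working condition 1
                    [-1.0, -1.0]])      # center of working condition 2
sigma = 0.5                             # standard deviation of both conditions

# 60 Gaussian-perturbed operating points around each center, 120 in total
points = np.vstack([c + sigma * rng.standard_normal((n_per_condition, 2))
                    for c in centers])
labels = np.repeat([1, 2], n_per_condition)
```

Points falling far from their cluster center then play the role of the gross-error cases discussed below.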

As shown in Figure 3, the operating points lying on the edge of working condition 1 and working condition 2 were considered as operating points with gross error in this study. Moreover, the case solutions of working condition 1 and working condition 2 were designed as Equations (9) and (10), respectively.

$$Y_1(i) = 0.2(x_1(i) - 1)^2 + 0.3(x_2(i) - 1)^2 + (x_1(i) - 1) + 4 \tag{9}$$

$$Y_2(i) = -0.2(x_1(i) + 1)^2 + 0.5(x_2(i) + 1) - 4 \tag{10}$$

Their parameters were designed differently to reflect the diverse operating experience of different working conditions. Furthermore, Equations (9)–(12) were designed as quadratic polynomials to represent the nonlinearity of the operating experience. For operating points with gross error, the measured descriptions deviate heavily from the accurate descriptions, so their case solutions are less helpful for operational optimization than those of normal cases. For this reason, the case solutions of working condition 1 and working condition 2 with gross error were designed as Equations (11) and (12), respectively.

$$Y_{1e}(i) = 0.2(x_1(i) - 1)^2 + 0.3(x_2(i) - 1)^2 + (x_1(i) - 1) + 8 \tag{11}$$

$$Y_{2e}(i) = -0.2(x_1(i) + 1)^2 + 0.5(x_2(i) + 1) - 8 \tag{12}$$
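As a quick check of Equations (9)–(12), they can be written as plain functions (a Python sketch; the function names are ours, not from the original study). Note that at the condition centers the gross-error solutions differ from the normal ones by exactly ±4:

```python
def y1(x1, x2):
    """Eq. (9): normal cases, working condition 1."""
    return 0.2 * (x1 - 1) ** 2 + 0.3 * (x2 - 1) ** 2 + (x1 - 1) + 4

def y2(x1, x2):
    """Eq. (10): normal cases, working condition 2."""
    return -0.2 * (x1 + 1) ** 2 + 0.5 * (x2 + 1) - 4

def y1e(x1, x2):
    """Eq. (11): gross-error cases, condition 1 (constant term 4 -> 8)."""
    return 0.2 * (x1 - 1) ** 2 + 0.3 * (x2 - 1) ** 2 + (x1 - 1) + 8

def y2e(x1, x2):
    """Eq. (12): gross-error cases, condition 2 (constant term -4 -> -8)."""
    return -0.2 * (x1 + 1) ** 2 + 0.5 * (x2 + 1) - 8
```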

In this numerical simulation, 60 operating points were randomly chosen from the generated dataset as the case base, while the remaining 60 operating points were equally divided into two datasets. Specifically, the first was used as the training dataset to select the optimal parameters *k*, *m*, and *α*, and the second was used as the testing dataset to evaluate the performance of the designed abnormal case removal method with the selected parameters. The evaluation criterion was the Mean Absolute Error (MAE):

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} \left| Y_i - Y_{i,\mathrm{suggested}} \right| \tag{13}$$

where *n* is the number of cases in the testing dataset, and *Yi* and *Yi*,*suggested* are the optimal solution and the suggested solution of the *i*th case, respectively.
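Equation (13) amounts to a one-line computation; a minimal Python sketch:

```python
import numpy as np

def mae(optimal, suggested):
    """Mean Absolute Error between optimal and suggested solutions, Equation (13)."""
    optimal = np.asarray(optimal, dtype=float)
    suggested = np.asarray(suggested, dtype=float)
    return np.abs(optimal - suggested).mean()
```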

Since *k* is a crucial parameter for case retrieval and its value directly affects the performance of CBR, a sensitivity analysis was first carried out to find the best value of *k*. Figure 4 presents the MAE of the training dataset as *k* changed from 1 to 15.
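This sensitivity analysis can be sketched as below. We assume here, since the reuse step is not restated in this section, that the *k* retrieved solutions are simply averaged; `case_X`, `case_y`, and the distance metric are illustrative:

```python
import numpy as np

def retrieve_and_reuse(case_X, case_y, query, k):
    """Retrieve the k nearest cases to the query and average their solutions."""
    distances = np.linalg.norm(case_X - query, axis=1)
    nearest = np.argsort(distances)[:k]
    return case_y[nearest].mean()

def mae_for_k(case_X, case_y, train_X, train_y, k):
    """MAE on the training dataset for a given number of retrieved cases k."""
    predictions = np.array([retrieve_and_reuse(case_X, case_y, q, k)
                            for q in train_X])
    return np.abs(predictions - train_y).mean()

# Sweep k from 1 to 15 and keep the minimizer, as in Figure 4:
# best_k = min(range(1, 16),
#              key=lambda k: mae_for_k(case_X, case_y, train_X, train_y, k))
```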

**Figure 4.** MAE of the training dataset with different parameter *k*.

As shown in Figure 4, the MAE first decreased as *k* changed from 1 to 6, and then generally increased as *k* changed from 6 to 15. The minimal MAE was 0.1896 when *k* was set to 6. Therefore, the number of retrieved cases was set as 6 both in classic CBR and in the improved CBR with the proposed abnormal case removal method. In addition, in order to find the best parameters *m* and *α* for the abnormal case removal algorithm, orthogonal experiments were designed with the training dataset. In particular, *m* was varied from 1 to 5 while *α* was varied from 0.2 to 2.2. Table 2 shows the MAE of the training dataset for different combinations of *m* and *α*.


**Table 2.** MAE of the training dataset with different parameter combination. Bold shows the optimal number.

As shown in Table 2, the minimal MAE of the training dataset was 0.1457 when the parameters *m* and *α* were set as 4 and 1, respectively. The reasons why *m* and *α* influence the MAE of the training dataset are analyzed as follows:


In the end, the best parameters of the designed abnormal case removal algorithm were set as *k* = 6, *m* = 4, and *α* = 1. With this parameter combination, the testing dataset was finally used to show the effectiveness and superiority of our method. Additionally, the Cauchy fuzzy membership function was selected for the case-based fuzzy reasoning, and its optimal parameters were 0.725 and 0.837, based on its performance against measurement error. The concrete fuzzy membership function evaluation method and parameter optimization method can be found in reference [18]. Figure 5 presents the concrete results.

According to Figure 5, the set values of our method are closer to their corresponding optimal set values than those of the other two methods. Specifically, there are five operating points in total (marked with red boxes) in which our method outperformed both classic CBR and case-based fuzzy reasoning. On average, the abnormal case removal method improved the setting accuracy on the testing dataset by 20.3% compared with classic CBR, and by 8.5% compared with case-based fuzzy reasoning. The reason our method obtains better results is that abnormal cases retrieved in the classic case retrieval step can be removed with Equations (5) and (7). By eliminating the retrieved cases whose LOFs are higher than the threshold, their impact is removed from the case reuse step, thus improving the quality of the retrieved cases. Naturally, the MAE of the testing dataset decreases, and the performance of operational optimization under the CBR framework improves.
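A minimal sketch of this removal step: score each retrieved case with the standard Local Outlier Factor using *m* neighbours, then drop cases whose score exceeds the threshold. Since Equations (5) and (7), which derive the threshold from *α*, are not reproduced in this section, a hypothetical fixed threshold `tau` stands in for them here:

```python
import numpy as np

def lof_scores(X, m):
    """Local Outlier Factor of each row of X, using m nearest neighbours."""
    n = len(X)
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)   # pairwise distances
    np.fill_diagonal(D, np.inf)                                 # exclude self-distance
    idx = np.argsort(D, axis=1)[:, :m]                          # m nearest neighbours
    kdist = D[np.arange(n), idx[:, -1]]                         # m-distance of each point
    # reachability distance of each point w.r.t. each of its neighbours
    reach = np.maximum(kdist[idx], D[np.arange(n)[:, None], idx])
    lrd = 1.0 / reach.mean(axis=1)                              # local reachability density
    return lrd[idx].mean(axis=1) / lrd                          # LOF: ~1 for inliers

def remove_abnormal(cases, m, tau):
    """Keep only the cases whose LOF does not exceed the threshold tau."""
    return cases[lof_scores(cases, m) <= tau]
```

Cases lying in a locally sparse region receive an LOF well above 1 and are filtered out before reuse, which is exactly the effect described above.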

**Figure 5.** Set values of the testing dataset for numerical simulation.

### *4.2. Operational Optimization of an Industrial Cut-Made Process of Cigarette Production*

In this case study, the designed abnormal case removal method was tested with industrial data collected from a cut-made process of cigarette production. In this process, the operator aims to keep the moisture content of the leaf-silk close to its desirable value, and the operational optimality affects the quality of the cigarettes. Specifically, the studied cut-made process includes three procedures: (1) the leaf-silk drying procedure, (2) the blending procedure, and (3) the spicing procedure. Since much operating experience is stored in the production data, the set value of the moisture content in the leaf-silk drying procedure could be determined from historical optimal cases. Table 3 presents the basic structure of the historical cases for the operational optimization of the cut-made process of cigarette production.


**Table 3.** Structure of historical case for the operational optimization of cut-made process.

After data preprocessing, a total of 200 cases containing valuable operating experience were extracted from the production data. Then, 100 cases were randomly chosen from the 200 cases as the case base, while the rest were equally divided into a training dataset and a testing dataset. As in the numerical simulation, MAE was chosen to evaluate the operational optimization performance, and an orthogonal experiment was conducted to find the best parameter combination for the abnormal case removal algorithm and CBR. By trial and error, the best parameters of the proposed abnormal case removal algorithm were set as *k* = 8, *m* = 5, and *α* = 0.6, with which the operational optimization performance on the training dataset was improved by 22.3% compared with classic CBR. Furthermore, the Gaussian membership function was selected, and the optimized parameters are displayed in Table 4. Figure 6 exhibits the set values provided by these methods for the industrial cut-made process in the testing dataset.

**Table 4.** Optimized parameters of Gaussian membership function in the industrial case study.


**Figure 6.** Set values of the testing dataset for industrial cut-made process.

As shown in Figure 6, CBR with the designed abnormal case removal method (our method) obtained better results in the operational optimization of the moisture content of the leaf-silk drying machine in production line A. In particular, there are six operating points in total (marked with red boxes) in which our method outperformed both classic CBR and case-based fuzzy reasoning. This is because some abnormal cases were removed by the proposed case removal method in the case retrieval step. Furthermore, the influence of multiple working conditions was not considered in case-based fuzzy reasoning, so the performance of CBR with the designed abnormal case removal method was better. In summary, the MAE of classic CBR on the testing dataset was 0.034 and the MAE of case-based fuzzy reasoning was 0.030, while the MAE of our method was 0.026. The proposed abnormal case removal method thus improved the MAE by 23.5% compared with classic CBR, and by 13.3% compared with case-based fuzzy reasoning. Therefore, the effectiveness and superiority of the local density-based abnormal case removal method were demonstrated, and it is suitable for the operational optimization of industrial processes.
