4.2. Evaluation Indicators and Test Functions
In this paper, we used three indicators to evaluate our model performance. We used Root Mean Square Error (RMSE) [36] to assess the overall accuracy of the SSTP models. To evaluate the convergence and uniformity of the parameter optimization algorithm, we used Generational Distance (GD) [37] and Spacing (SP) [38] as the indicators, and introduced three test functions. The details of these indicators and test functions are described in the following.
RMSE is a measure of the deviation between the predicted value and the true value: the smaller the value, the more accurate the prediction. The formula of RMSE is as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$

where $y_i$ is the true value, $\hat{y}_i$ is the predicted value, and $n$ is the number of days to predict.
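As an illustration, the RMSE above can be computed with a few lines of Python (the variable names `y_true` and `y_pred` are our own):

```python
import math

def rmse(y_true, y_pred):
    """Root Mean Square Error over the n predicted days."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)
```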
The Pareto optimal solution set obtained by a multi-objective optimization algorithm should be both convergent and uniformly distributed. To evaluate the convergence and uniformity of the Pareto frontier obtained by the algorithm, that is, to obtain as many Pareto optimal solutions as possible and to approach the true Pareto frontier as closely as possible, GD was used to evaluate the convergence of the solution set: the smaller the GD, the better the convergence. In addition, the Pareto optimal solutions should be distributed as evenly as possible along the Pareto frontier, so SP was used to evaluate the uniformity of the distribution: the smaller the SP, the more uniform the distribution of the solution set. GD is defined as:

$$\mathrm{GD} = \frac{\sqrt{\sum_{i=1}^{n} d_i^{2}}}{n}$$

where $n$ is the number of Pareto optimal solutions and $d_i$ is the distance in the objective space from the $i$-th Pareto optimal solution to the nearest individual of the true Pareto frontier. SP is defined as:

$$\mathrm{SP} = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(\bar{d} - d_i\right)^{2}}$$

where $n$ is the number of Pareto optimal solutions, $d_i$ is the distance in the objective space from the $i$-th Pareto optimal solution to the nearest other individual of the solution set, and $\bar{d}$ is the average of the $d_i$.
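A minimal sketch of the two indicators, assuming Euclidean distance in the objective space (function and variable names are our own; the true Pareto front is given as a list of objective-space points):

```python
import math

def generational_distance(solutions, pareto_front):
    # d_i: distance from the i-th obtained solution to the nearest
    # point of the true Pareto front
    dists = [min(math.dist(s, p) for p in pareto_front) for s in solutions]
    n = len(solutions)
    return math.sqrt(sum(d * d for d in dists)) / n

def spacing(solutions):
    # d_i: distance from the i-th solution to its nearest neighbour
    # within the obtained solution set
    n = len(solutions)
    dists = [min(math.dist(s, t) for j, t in enumerate(solutions) if j != i)
             for i, s in enumerate(solutions)]
    d_bar = sum(dists) / n
    return math.sqrt(sum((d_bar - d) ** 2 for d in dists) / (n - 1))
```

A solution set lying exactly on the true front gives GD = 0, and a perfectly evenly spaced set gives SP = 0.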
LSPSO was applied to the bi-objective optimization of the accuracy and efficiency of the SSTP. In order to verify the feasibility of LSPSO on bi-objective optimization problems, three commonly used bi-objective test functions were selected for testing: BNH [39], SRN [40], and TNK [41]. We compared the GD and SP indicators of the optimal frontiers obtained by LSPSO with those of MODE, NSGA-II, and OMOPSO, respectively.
Table 1 lists the characteristics of these three test functions.
4.3. Analysis of Experimental Results
In this work we designed three sets of experiments: (1) to compare the advantages and disadvantages of the similarity measures in SSTP and choose the best method; (2) to verify the effectiveness of LSPSO by comparing it with the MODE, NSGA-II, and OMOPSO algorithms; (3) to compare the performance of DSL with other SSTP methods.
Experiment 1: Comparison of similarity measures. Applying AC to SSTP requires choosing a suitable similarity measure method to measure the similarity of SST sequences. Therefore, we first computed the Euclidean distance, the cosine distance, and the DTW distance to measure the similarity of SSTs. Then, we chose the optimal similarity measure based on this principle: the better the similarity is measured, the smaller the SSTP error is. The error of SSTP was measured by Root Mean Square Error (RMSE).
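For reference, the three similarity measures can be sketched as follows; the `dtw` function is a plain O(mn) dynamic-programming implementation with absolute-difference cost, which is one common choice (the paper does not specify its exact cost function):

```python
import math

def euclidean(a, b):
    """Point-wise Euclidean distance between two equal-length sequences."""
    return math.dist(a, b)

def cosine_distance(a, b):
    """1 - cosine similarity; sensitive to trend, not to magnitude."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (na * nb)

def dtw(a, b):
    """Classic dynamic-programming DTW; allows temporal scaling."""
    inf = float("inf")
    D = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    D[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[len(a)][len(b)]
```

Unlike the Euclidean distance, DTW matches `[1, 2, 3]` to `[1, 2, 2, 3]` with zero cost, which is why it copes with SST patterns whose duration is not fixed.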
Table 2 shows the RMSE achieved for SST predictions based on the Euclidean distance, cosine distance, and DTW distance similarity measures, respectively. The first nine rows provide the RMSE values for nine different SST sequences, with the average values given in the last row. When using the Euclidean distance to predict SST, the average RMSE is 0.4949, which is slightly higher than (at times comparable to) DTW, but much better than cosine. The reason for this is that the number of days over which the SST changes regularly is not fixed, and the Euclidean distance does not support temporal scaling. The DTW distance can overcome this deficiency and can therefore better reflect the similarity of SST changes.
The average RMSE when using the cosine distance to predict SST is 1.0655. This is much higher than for the other two measures, reflecting a much worse prediction ability. This happens because the cosine distance uses the cosine of the angle between the SST vectors to measure their similarity, which only reflects the trend of SST changes and is not sensitive to the SST values themselves, whereas the DTW and Euclidean distances are based on the SST values. In summary, DTW has the highest prediction accuracy among the three. Therefore, we use only DTW in the remaining experiments below.
Experiment 2: Verification of LSPSO. An LSPSO algorithm was proposed in this paper. To verify its effectiveness, three classical bi-objective test functions (BNH, SRN, and TNK) were selected, with GD and SP as the evaluation indicators. The population size N was set to 100 and the number of iterations G to 250, and LSPSO was compared with MODE, NSGA-II, and OMOPSO, respectively. Each algorithm was run independently 30 times on each test function, and the mean and variance of the GD and SP values were computed. Analysis of Variance (ANOVA) [42] was used to test the significance of the differences in the GD and SP indicators between LSPSO and the other models (MODE, NSGA-II, and OMOPSO). In general, p-values smaller than 0.05 indicate a significant difference. Figure 3, Figure 4 and Figure 5 show the solutions obtained by LSPSO for the BNH, SRN, and TNK functions together with the true Pareto frontier. The red circles are the optimal solutions obtained by LSPSO, and the black line is the true Pareto front. The optimal solution sets of LSPSO are convergent and evenly distributed on the true Pareto front.
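The per-run GD and SP samples can be compared with a one-way ANOVA; a minimal pure-Python sketch of the F statistic is shown below (the p-value is then read from the F(k - 1, N - k) distribution; in practice a library such as SciPy would be used):

```python
def anova_f(*groups):
    """One-way ANOVA F statistic for k groups of run results,
    e.g. 30 GD values per algorithm."""
    k = len(groups)
    N = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / N
    means = [sum(g) / len(g) for g in groups]
    # between-group and within-group sums of squares
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum(sum((x - m) ** 2 for x in g) for g, m in zip(groups, means))
    return (ss_between / (k - 1)) / (ss_within / (N - k))
```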
Table 3 shows the GD and SP of MODE, NSGA-II, OMOPSO, and LSPSO in solving BNH, SRN, and TNK, respectively. Overall, our LSPSO is better than the other three methods according to the mean and std values of GD and SP.
When dealing with the BNH function, our LSPSO method outperforms the other three methods in the uniformity of the solution set, with a statistically significant difference in SP (p < 0.001). This can be observed in Figure 3, where the solution set of LSPSO is uniformly distributed on the Pareto front for BNH. The convergence of LSPSO is much better than that of MODE in terms of GD (p < 0.001), but it is not significantly better than NSGA-II and OMOPSO (p = 0.37 and p = 0.15, respectively).
For the SRN function, the GD of LSPSO is significantly better than those of the other three methods (p < 0.001), indicating good convergence. Regarding the uniformity of the solution set distribution (SP), although LSPSO is significantly better than MODE and NSGA-II (p < 0.001), it is worse than OMOPSO (p < 0.001). This may be because the true Pareto frontier of the SRN function is linearly distributed, and global search has an advantage on such problems; when the true Pareto frontier of the test function is non-linear, our method achieves much better results.
For the TNK function, the mean and std of the GD and SP obtained by LSPSO are better than those of the other methods, indicating that LSPSO handles TNK well. According to the significance test, our LSPSO has performance similar to that of NSGA-II in terms of GD (p = 0.48), and to that of OMOPSO in terms of SP (p = 0.95).
In summary, LSPSO handles the bi-objective test functions well and provides effective support for the parameter optimization of DS. MODE, NSGA-II, and OMOPSO have strong global search capabilities but insufficient local search capabilities, so the solution sets they obtain are not uniform and easily fall into local optima. In standard PSO, the global and local search capabilities are mutually constrained, and the algorithm tends to fall into local optima in the later stages of the search. In this paper, a local search strategy is used to enhance the local search capability of PSO, so that the improved PSO has independent global and local search capabilities. Therefore, the obtained solution set has better convergence and a more uniform distribution.
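The idea of augmenting PSO with an explicit local search step can be illustrated with a deliberately simplified, single-objective sketch. This is not the paper's LSPSO, which is bi-objective and whose exact strategy is described elsewhere in the paper; it only illustrates the principle, and all names and constants are our own:

```python
import random

def pso_with_local_search(f, bounds, n_particles=20, iters=100, seed=0):
    """Minimize f(x) on a 1-D interval with PSO, refining the global
    best each iteration via small random perturbations (local search)."""
    rng = random.Random(seed)
    lo, hi = bounds
    xs = [rng.uniform(lo, hi) for _ in range(n_particles)]
    vs = [0.0] * n_particles
    pbest = xs[:]
    gbest = min(pbest, key=f)
    for _ in range(iters):
        for i in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            # standard velocity update: inertia + cognitive + social terms
            vs[i] = 0.7 * vs[i] + 1.5 * r1 * (pbest[i] - xs[i]) \
                                + 1.5 * r2 * (gbest - xs[i])
            xs[i] = min(max(xs[i] + vs[i], lo), hi)
            if f(xs[i]) < f(pbest[i]):
                pbest[i] = xs[i]
        # local search: accept perturbations of gbest only if they improve it
        for _ in range(5):
            cand = min(max(gbest + rng.gauss(0.0, 0.1 * (hi - lo)), lo), hi)
            if f(cand) < f(gbest):
                gbest = cand
        gbest = min(pbest + [gbest], key=f)
    return gbest
```

Because the perturbation step accepts a candidate only when it improves the global best, the local search sharpens convergence without disturbing the global exploration carried out by the swarm.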
Experiment 3: Comparison of DSL performance with other SSTP methods. DSL is a combination of DTW + SVM (DS) and LSPSO. The space composed of the PL and the IS is defined as the search space of the particles, and prediction accuracy and efficiency are set as the optimization objectives; LSPSO then computes appropriate PL and IS values, which are used as the parameters of DS to predict SST. Here, we compare the performance of DSL with DTW, SVM, DS, and LSTM in SSTP. When predicting 5-day SST from a given t days of SST data, the DTW method finds the series most similar to the query in the historical data and takes its following 5 days as the prediction. SVM and LSTM are both trained by fitting the nonlinear changes in SST. DS trains an SVM on the top-k similar series selected according to DTW.
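As an illustration of the DTW baseline described above, a sketch follows; `dist` is any sequence-distance function (DTW in the paper), and all names are our own:

```python
def dtw_baseline_predict(query, history, dist, horizon=5):
    """DTW baseline: find the historical window most similar to the
    recent t-day query and return the `horizon` days that followed it.
    `history` is the full historical SST series as a flat list."""
    t = len(query)
    best_start = min(
        range(len(history) - t - horizon + 1),
        key=lambda s: dist(query, history[s:s + t]),
    )
    return history[best_start + t:best_start + t + horizon]
```

DS replaces the last step: instead of copying the days that followed the single best match, it trains an SVM on the top-k matched series.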
The comparison results are shown in Table 4. The average RMSE obtained by DS in predicting SST is 0.4468, which is lower than the average RMSE of DTW. This indicates that the combination of DTW and SVM can effectively utilize the information mined by DTW. Secondly, the prediction results of the DS algorithm are better than those of SVM. This is because the SST sequence contains a lot of redundant information, which interferes with the model during prediction and lowers the prediction accuracy of the SVM.
LSTM achieves an average RMSE of 0.5211, which is worse than DTW, DS, and DSL. Although LSTM was developed for long- and short-term prediction problems, it does not work well in predicting SST. This is probably due to the non-stationarity of SST.
DSL obtains the optimal PL and IS parameters via LSPSO and uses them in DS to predict SST. Comparing the RMSE values of DS and DSL shows that DSL predicts SST better than DS, indicating that the parameters of DS can be effectively optimized by LSPSO, which is responsible for a 16.7% improvement in prediction accuracy in terms of reduced RMSE. In summary, the overall prediction performance of DSL is the best, indicating that the method can effectively predict SST and verifying the effectiveness of the proposed approach.
We also illustrate the results predicted by the different SSTP methods. Figure 6 shows a sample randomly selected from the results. The black line represents the true values; the blue line represents the results predicted using DTW; the red and yellow lines correspond to SVM and LSTM, respectively; and the results predicted using DSL are shown in green. It is clear that the results predicted by our method (green) are the closest to the ground truth (black), including the trend changes, whereas the predictions of the other methods fluctuate significantly.
Finally, Figure 7 shows the operating efficiency of DS at predicting SST before and after LSPSO optimization. RT represents the running time in seconds. The first nine columns provide the RT for nine different SST sequences, with the average values given in the last column. The red bars show the RT of DS before optimization, and the blue bars the RT after optimization. It can be clearly seen that, for each SST series, the RT before optimization is much longer, with an average acceleration of 76%. These results verify the effectiveness of the proposed method.