*5.5. Transfer Learning Capacity Test*

We validated the transfer learning capacity of the proposed method to meet a common real-life need. For instance, a new company may want to predict electricity consumption but lack enough historical data to train a model. This requires a model that is trained on another company's data and still performs well on the new company's data, i.e., a model with good transfer learning capacity. The following experiments, described in Table 11, are designed to test this capacity. We used the training part of AEP to train the model for VSTF, the training part of COMED to train the model for STF, and the training part of DAYTON to train the models for MTF and LTF. In each case, the testing parts of the other data sets were used for testing.
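The source/target assignment described above can be sketched as a small lookup. This is only an illustration: the function name `transfer_plan` is hypothetical, and the list of data sets is an assumption limited to the three names mentioned in this section.

```python
# Source data set used for training at each forecast horizon, as stated in the text.
SOURCE_BY_HORIZON = {
    "VSTF": "AEP",
    "STF": "COMED",
    "MTF": "DAYTON",
    "LTF": "DAYTON",
}

# Assumption: only the data sets named in this section are listed here.
DATASETS = ["AEP", "COMED", "DAYTON"]


def transfer_plan(source_by_horizon, datasets):
    """Return (horizon, training source, test targets) for each horizon.

    The model is trained on the training part of the source data set and
    evaluated on the testing part of every other data set.
    """
    plan = []
    for horizon, source in source_by_horizon.items():
        targets = [name for name in datasets if name != source]
        plan.append((horizon, source, targets))
    return plan


for horizon, source, targets in transfer_plan(SOURCE_BY_HORIZON, DATASETS):
    print(f"{horizon}: train on {source}, test on {', '.join(targets)}")
```

Keeping the assignment in one table makes it easy to check that no model is evaluated on the data set it was trained on.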


**Table 11.** Experiment design for transfer learning capacity test of the proposed deep model.

As comparative experiments, the DNN [34] and the proposed MCSCNN–LSTM were each trained and tested on the same data set. For example, for VSTF we trained the DNN and MCSCNN–LSTM on the training part of COMED and of DAYTON, and tested each model on the testing part of the same data set. The results are shown in Figure 8, where the *x*-axis is the testing part of each data set. They indicate that the proposed method has good transfer learning capacity: in the transfer setting it outperforms DNN [34] for all kinds of forecasts and performs only slightly worse than the proposed method trained and tested on the same data set. We performed a *t*-test to quantify this difference, and the resulting *p*-values are shown in Table 12. A *p*-value higher than 0.05 is taken to mean that there is no significant difference. The results show no significant difference when a different company's data were used to train the model. Moreover, even though DNN [34] was trained and tested on the same source data, its performance is worse than that of the proposed method in the "transfer" setting. Notably, there is a significant improvement over DNN for the VSTF of electricity consumption. In summary, Figure 8 and Table 12 confirm that the proposed method has excellent transfer learning capacity against noisy data.
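The significance check can be illustrated with a minimal sketch. The text does not state which variant of the *t*-test was used, so Welch's unequal-variance test is assumed here. The error values are hypothetical placeholders, not results from this study, and the helper names are illustrative. The *p*-value is obtained by numerically integrating the Student's *t* density, so only the standard library is needed.

```python
import math
from statistics import mean, variance


def t_pdf(x, dof):
    """Density of Student's t distribution with `dof` degrees of freedom."""
    log_norm = (math.lgamma((dof + 1) / 2) - math.lgamma(dof / 2)
                - 0.5 * math.log(dof * math.pi))
    return math.exp(log_norm - (dof + 1) / 2 * math.log1p(x * x / dof))


def two_sided_p(t_stat, dof, steps=2000):
    """Two-sided p-value via Simpson's rule on [0, |t|] (`steps` must be even)."""
    upper = abs(t_stat)
    if upper == 0.0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, dof) + t_pdf(upper, dof)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, dof)
    central_mass = 2 * total * h / 3  # P(-|t| < T < |t|)
    return max(0.0, 1.0 - central_mass)


def welch_t_test(a, b):
    """Welch's t-test; returns (t statistic, two-sided p-value)."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t_stat = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    dof = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t_stat, two_sided_p(t_stat, dof)


# Hypothetical per-run errors, for illustration only (not values from the paper).
same_source_errors = [0.52, 0.49, 0.55, 0.51, 0.50]
transfer_errors = [0.54, 0.50, 0.56, 0.53, 0.52]

t_stat, p_value = welch_t_test(same_source_errors, transfer_errors)
significant = p_value < 0.05  # the 0.05 threshold is the one used in the text
print(f"t = {t_stat:.3f}, p = {p_value:.3f}, significant: {significant}")
```

If SciPy is available, `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the same Welch test. A paired test would be the more natural choice if the two error series come from the same test samples.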


**Table 12.** The *p*-value of significance test using *t*-test.
