1. Introduction
In recent years, under the pressure of global warming, deterioration of the human ecological environment, shortages of non-renewable energy resources, and environmental pollution, solar radiation energy has become highly valued worldwide as an inexhaustible clean energy source, and solar photovoltaic power has consequently developed rapidly [1,2]. However, photovoltaic power generation is volatile and intermittent, and large-scale grid connection negatively impacts the stability and security of the power grid and can even cause serious economic losses [3,4]. To increase the proportion of photovoltaic power generation in the power system, the key is timely and effective power dispatching, for which an accurate photovoltaic power generation forecast is an important basis. Because the fluctuation of photovoltaic power generation is mainly caused by changes in solar irradiance, it is important to predict solar irradiance accurately; such predictions can provide important decision support for power dispatching systems and can effectively reduce the operational costs of the power system [5,6].
Many methods are used for short-term and ultra-short-term solar irradiance prediction. Traditional prediction methods mainly use statistical methods, such as time series and regression analysis, to establish the relationship between historical values and solar irradiance [7,8]. However, traditional statistical methods cannot accurately describe the complex nonlinear relationship between the various meteorological variables and solar irradiance, which limits improvements in prediction accuracy.
With the rise of machine learning technology, many scholars have applied machine learning methods to solar irradiance prediction and have achieved good results [9]. For example, support vector machine (SVM) [10], extreme learning machine (ELM) [11], and artificial neural network (ANN) methods [12] have all been shown to produce better results than linear regression methods when predicting solar irradiance. Moreover, machine learning methods, especially ANNs, combined with numerical weather prediction have also achieved great improvements in hour-ahead and medium-term forecasts [13,14,15]. Furthermore, compared to traditional machine learning, deep learning methods, such as the recurrent neural network (RNN), have shown the potential to further improve the prediction of solar irradiance [16].
At the same time, with the development of hardware technologies, such as charge-coupled devices (CCDs), and the continuous improvement of digital image processing technology [17], many remote sensing instruments for measuring total cloud cover have been successfully developed, such as the total sky imager (TSI), which can accurately monitor and collect cloud images over photovoltaic power stations in real time [18]. These images contain rich information that is more beneficial to the prediction of solar irradiance than historical observation values such as cloud cover. However, the existing solar irradiance prediction methods based on the TSI have some disadvantages that cannot be ignored. For example, manual image feature extraction relies heavily on the experience of researchers, and it is often difficult to obtain satisfactory prediction results [19]. To address this, Feng et al. [20] designed the SolarNet model, which can automatically extract the features of a total sky image; however, this model uses only one total sky image as the model input, which ignores cloud motion information and greatly limits the prediction accuracy. Zhao et al. [21] designed a three-dimensional convolutional neural network (3D-CNN) model to fuse multiple images and historical values, and then input the fused features into a multilayer perceptron (MLP). However, as a traditional neural network structure, an MLP cannot capture the long-term memory of an input time series because its hidden nodes have no connections across time steps. Therefore, an MLP often performs poorly when predicting a time series. The long short-term memory (LSTM) network has a memory unit that can retain previous information and apply it to the calculation of the current output; that is, the hidden nodes at successive time steps are connected [22]. Therefore, compared to an MLP, an LSTM performs better when predicting a time series, and LSTM networks have been used to predict solar irradiance owing to their strong time-series learning ability [23,24].
To address the shortcomings of the above models, this study developed a Siamese convolutional neural network-long short-term memory (SCNN–LSTM) model. A Siamese CNN automatically extracts the spatial features of multiple consecutive total sky images while retaining their temporal features. Historical meteorological features and image features are then fused using a concatenation layer, and the fused features are input into the LSTM for the prediction of solar irradiance within hours.
Because the direct normal irradiance (DNI) is vital to concentrated solar thermal power plants, whereas the global horizontal irradiance is important to photovoltaic power plants [25], the DNI was taken as the research target in this study to evaluate the performance of the proposed model. Two years of data were collected from the National Renewable Energy Laboratory (NREL) [26], and several experiments were carried out to verify the effectiveness of the proposed method.
The main contributions of this study are as follows: (1) a Siamese CNN was developed to automatically extract the features of consecutive total sky images, where the Siamese structure reduces the model training time by sharing part of the model parameters; (2) the SCNN-LSTM was used to effectively fuse the time-series features of images and meteorological data and to improve the DNI prediction accuracy.
The remainder of this paper is organized as follows: Section 1 introduces the three correlation networks, based on which the proposed model was constructed. Section 2 describes the collection and processing of the experimental materials. Section 3 presents a SCNN–LSTM forecasting model of DNI. Section 4 discusses the experimental results and analyzes the performance of the SCNN–LSTM model based on several comparative experiments. Finally, Section 5 summarizes the conclusions.
3. SCNN-LSTM Prediction Model
In this section, a SCNN–LSTM model is designed to predict the DNI 10 min ahead; the structure of the SCNN-LSTM model is shown in Figure 3. First, the cloud features are extracted from a group of consecutive total sky images in order to compensate for the information missing from a single image where the sky is blocked by the shadow band [30,31]. The cloud features and meteorological variables are then normalized and fused as inputs to the LSTM to predict the DNI clear-sky index for the next 10 min.
3.1. Input Dimension
The Bayesian information criterion (BIC) was used to determine the input dimension of the forecasting model from the DNI clear-sky index; that is, it determines the number n of previous moments to which the DNI at moment t is related [32]. The BIC was used to determine the order of the DNI clear-sky index sequence after first-order differencing, and the resulting BIC heat map is shown in Figure 4. The BIC reached its minimum value when the autoregressive order was 1 and the moving average order was 2. Because the DNI clear-sky index sequence had been differenced once, the order of the model was determined to be 3, which means that the information at times t − 2Δt, t − Δt, and t was used to predict the DNI clear-sky index at time t + Δt, where Δt is 10 min.
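With a model order of 3, each training sample pairs the clear-sky index at t − 2Δt, t − Δt, and t with the value at t + Δt. The following is a minimal sketch of that windowing step only. The helper `make_windows` and the sample values are hypothetical and do not come from the original study.

```python
def make_windows(series, n_lags=3, horizon=1):
    """Pair each run of n_lags consecutive values with the value `horizon` steps later."""
    samples = []
    for end in range(n_lags, len(series) - horizon + 1):
        inputs = series[end - n_lags:end]    # values at t-2Δt, t-Δt, t
        target = series[end + horizon - 1]   # value at t+Δt
        samples.append((inputs, target))
    return samples


# Hypothetical DNI clear-sky index values sampled every 10 min.
kt = [0.91, 0.88, 0.52, 0.47, 0.80, 0.93]
windows = make_windows(kt)

for inputs, target in windows:
    print(inputs, "->", target)
```

A series of six values yields three samples here, because the first three values are needed as inputs before the first target becomes available.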
3.2. Siamese Convolutional Neural Network
The convolutional neural network (CNN) is one of the representative algorithms of deep learning. It is capable of representation learning and can extract high-order features from its inputs [33]. The main structure of a traditional CNN includes convolutional layers, pooling layers, and fully connected layers. A Siamese network [34] is a class of neural network that consists of two or more identical subnetworks; the subnetworks have the same structure and configuration, including the network parameters and weights. During the training phase, the parameter updates are mirrored across the subnetworks.
The proposed Siamese convolutional neural network (SCNN) combines the advantages of the CNN and the Siamese network and is used to extract high-order features from ground-based sky images at different times. Each branch of the SCNN is an improved version of the AlexNet network [35]. Because the images at the three input moments (i.e., t − 2Δt, t − Δt, and t) need to be processed, the SCNN has three improved AlexNet subnetworks; its structure is shown in Figure 5. The network structure, parameters, and weights of the three subnetworks are the same, and the three inputs jointly determine how the weights are updated.
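The weight sharing described above can be illustrated with a toy forward pass. This is only a sketch: `SharedBranch` is a hypothetical single linear map standing in for an improved AlexNet branch, and the image and meteorological values are invented. The point is that one parameter set is applied to all three moments, and the resulting image features are concatenated with the meteorological variables at each time step.

```python
class SharedBranch:
    """Toy stand-in for one SCNN branch: a single linear feature map."""

    def __init__(self, weights):
        self.weights = weights  # one row of weights per output feature

    def __call__(self, image):
        # Each feature is a weighted sum of the flattened pixels.
        return [sum(w * p for w, p in zip(row, image)) for row in self.weights]


# One parameter set, reused for all three time steps (hypothetical values).
branch = SharedBranch(weights=[[0.5, -0.5, 0.0, 1.0],
                               [0.25, 0.25, 0.25, 0.25]])

# Hypothetical flattened 2x2 "sky images" at t-2Δt, t-Δt, and t.
images = [[0.0, 1.0, 1.0, 0.0],
          [1.0, 1.0, 0.0, 0.0],
          [1.0, 0.0, 0.0, 1.0]]

# Hypothetical normalized meteorological variables at the same moments.
meteo = [[0.3, 0.7], [0.4, 0.6], [0.5, 0.5]]

# Per-time-step fusion: image features concatenated with meteorological features.
sequence = [branch(img) + met for img, met in zip(images, meteo)]

for step in sequence:
    print(step)
```

The resulting `sequence` has one fused feature vector per moment, which is the shape of input a recurrent layer such as an LSTM expects.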
3.3. Long Short-Term Memory
Long short-term memory (LSTM), a variant of the recurrent neural network (RNN), usually performs better than a standard RNN in prediction tasks [36]. In an LSTM, every neuron is a memory cell, and each cell has three gates, namely, the forget gate $f_t$, the input gate $i_t$, and the output gate $o_t$:

$$
\begin{aligned}
f_t &= \sigma\left(W_f \cdot [h_{t-1}, x_t] + b_f\right), \\
i_t &= \sigma\left(W_i \cdot [h_{t-1}, x_t] + b_i\right), \\
\tilde{C}_t &= \tanh\left(W_C \cdot [h_{t-1}, x_t] + b_C\right), \\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t, \\
o_t &= \sigma\left(W_o \cdot [h_{t-1}, x_t] + b_o\right), \\
h_t &= o_t \odot \tanh\left(C_t\right),
\end{aligned}
$$

where $h_{t-1}$ represents the output at the previous moment; $x_t$ represents the input at the current moment, which is the fusion of the cloud features and meteorological variables; $\sigma$ represents the sigmoid function; $W$ represents the weights; and $b$ represents the biases. The unit first decides which information in the memory cell from the previous moment should be discarded by multiplying the forget gate $f_t$ by the previous cell state $C_{t-1}$. Then, the new information is obtained by multiplying the input gate $i_t$ by the candidate content $\tilde{C}_t$ that needs to be updated. According to the above system of equations, the cell state $C_t$ at the current moment is obtained by discarding and updating the information. Finally, the $C_t$ state value is mapped to the range from −1 to 1 by the tanh layer, and the output $h_t$ at the current moment is obtained by multiplying the result of the tanh layer by the output gate $o_t$. This output is the DNI clear-sky index for the next 10 min; the DNI at that moment is then obtained by multiplying the predicted clear-sky index by the clear-sky DNI at the same moment.
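The gate operations described above can be sketched as a single update step. This is a minimal illustration that assumes the standard LSTM formulation with a scalar hidden state; a real model uses weight matrices and vector states. The function names, parameter layout, weights, and inputs are all hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x_t, h_prev, c_prev, p):
    """One step of a scalar LSTM cell; p maps gate name -> (w_h, w_x, b)."""

    def pre(gate):
        w_h, w_x, b = p[gate]
        return w_h * h_prev + w_x * x_t + b

    f_t = sigmoid(pre("f"))            # forget gate
    i_t = sigmoid(pre("i"))            # input gate
    c_cand = math.tanh(pre("c"))       # candidate content
    c_t = f_t * c_prev + i_t * c_cand  # discard old information, add new
    o_t = sigmoid(pre("o"))            # output gate
    h_t = o_t * math.tanh(c_t)         # output, bounded in (-1, 1)
    return h_t, c_t


def to_dni(clear_sky_index, dni_clear):
    """Convert a predicted clear-sky index back to DNI."""
    return clear_sky_index * dni_clear


# Hypothetical weights: the same (w_h, w_x, b) for the f, i, c, and o gates.
params = {gate: (0.5, 0.5, 0.0) for gate in "fico"}

h, c = 0.0, 0.0
for x in [0.2, 0.6, 0.9]:  # hypothetical fused feature, one scalar per step
    h, c = lstm_step(x, h, c, params)

print(h, c)
```

The helper `to_dni` mirrors the final conversion in the text: the predicted clear-sky index is multiplied by the clear-sky DNI at the same moment.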
3.4. Loss Function
In this study, the greatest difference from a traditional Siamese network is that the SCNN part of the SCNN–LSTM model is designed to find the key features in the total sky images at different times in order to provide image time-series features for the LSTM prediction, rather than to compare the similarity between these images. Therefore, there is no need to calculate the Euclidean distance between sample pairs to judge their similarity.
Accordingly, the SCNN–LSTM model does not need to use a contrastive loss function to represent the degree of matching between paired samples. Instead, it uses the prediction error (PE) between the predicted value and the observed value to train the SCNN–LSTM parameters, as follows:
where $\hat{y}$ is the predicted value of the forecast model, $y$ is the target, and N is the number of training samples.
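The exact form of the PE is not reproduced in the text here. As an illustration only, the sketch below assumes a mean-squared-error form, which is a common choice for regression losses averaged over N training samples; the function name `prediction_error` and the sample values are hypothetical.

```python
def prediction_error(predicted, target):
    """Mean squared difference between predictions and targets (assumed form of PE)."""
    if len(predicted) != len(target):
        raise ValueError("predicted and target must have the same length")
    n = len(predicted)
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / n


# Hypothetical predicted and observed clear-sky index values.
pe = prediction_error([0.5, 1.0, 0.0], [1.0, 1.0, 0.5])
print(pe)
```

Unlike a contrastive loss, this quantity depends only on the final prediction and the observation, so no pairwise distance between branch outputs is needed.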
Just as in a traditional Siamese network, the multiple subnetwork branches of the SCNN in this model also have the same network structure, parameters, and weights. In the implementation of the SCNN structure, only one network structure needs to be built, and then the network structure is mirrored to multiple subnetworks with different inputs. All the subnetworks jointly determine how the weights are updated in the network.
During model training, the weights of the whole model were adjusted simultaneously. First, the PE was used to evaluate the prediction error of the model; then, the error was propagated back through the fully connected layers, the LSTM layer, and the SCNN in turn. Notably, the weight sharing between the three branches of the SCNN was realized via mirroring, and the shared weights were jointly determined by the inputs of all three branches.
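The way shared weights are jointly determined by all three branches can be shown with a toy gradient calculation. This is a sketch under strong simplifications: a single shared scalar weight, a summed output instead of an LSTM and fully connected layers, and a squared-error loss. All names and values are hypothetical. Because every branch reads the same parameter, the gradient with respect to that parameter is the sum of the per-branch contributions.

```python
def forward(w, inputs):
    """Toy model: the same weight w is applied to every branch input, then summed."""
    return sum(w * x for x in inputs)


def shared_weight_gradient(w, inputs, target):
    """Gradient of the squared error with respect to the shared weight.

    Each branch contributes 2 * error * x_k; the contributions are summed
    because all branches read the same parameter.
    """
    error = forward(w, inputs) - target
    per_branch = [2.0 * error * x for x in inputs]
    return sum(per_branch), per_branch


w = 0.5
inputs = [1.0, 2.0, 3.0]  # hypothetical branch inputs at t-2Δt, t-Δt, and t
target = 2.0

grad, per_branch = shared_weight_gradient(w, inputs, target)


def loss(v):
    return (forward(v, inputs) - target) ** 2


# Finite-difference check of the summed gradient.
eps = 1e-6
numeric = (loss(w + eps) - loss(w - eps)) / (2 * eps)

# A single update is applied once and is seen by every branch.
w_new = w - 0.01 * grad
print(grad, numeric, w_new)
```

In a deep learning framework this accumulation happens automatically when the same layer object is called on several inputs, so only one set of branch parameters has to be stored and updated.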
5. Conclusions
In this study, a SCNN-LSTM model was proposed for predicting the DNI 10 min ahead. First, a SCNN was built from three subnetworks based on an improved AlexNet network; it extracts features from three total sky images and produces the image features at three moments, which are then fused with the historical meteorological observations. The fused features were input into the LSTM, followed by two fully connected layers that output the DNI clear-sky index prediction. To obtain the DNI, this prediction was multiplied by the clear-sky DNI value. Using the NREL open dataset for the whole year of 2014 as the testing set, the experimental results show that the nRMSE of the SCNN-LSTM model was 23.47% and the forecast skill was 24.51%. Compared with the other models used in this study, the prediction accuracy was improved.
This experiment also provided some inspiration for our future work. For example, DNI data could be classified based on weather conditions or cloud types, and DNI prediction models could then be constructed for the different weather conditions in order to further improve the prediction accuracy, especially on partly cloudy or cloudy days. In addition, we could try to adjust the number of samples under different weather conditions to balance the prediction accuracy across those conditions and thereby improve the overall prediction performance.