A Short-Term Power Prediction Method for Photovoltaics Based on Similar Day Clustering and Spatio-Temporal Feature Extraction

Huang, Xu; Wang, Leying; Ge, Leijiao; Hou, Luyang; Du, Tianshuo; Zheng, Yiwen; Chen, Yanbo

doi:10.3390/electronics13173536


by Xu Huang ¹, Leying Wang ¹,*, Leijiao Ge ¹, Luyang Hou ², Tianshuo Du ¹, Yiwen Zheng ¹ and Yanbo Chen ³

¹ School of Electrical Automation and Information Engineering, Tianjin University, Tianjin 300072, China

² School of Computer Science (National Pilot Software Engineering School), Beijing University of Posts and Telecommunications, Beijing 100876, China

³ State Key Laboratory of Alternate Electrical Power System with Renewable Energy Sources, North China Electric Power University, Beijing 102206, China

* Author to whom correspondence should be addressed.

Electronics 2024, 13(17), 3536; https://doi.org/10.3390/electronics13173536

Submission received: 8 August 2024 / Revised: 30 August 2024 / Accepted: 3 September 2024 / Published: 6 September 2024

(This article belongs to the Special Issue Advances in Enhancing Energy and Power System Stability and Control)


Abstract: Accurate PV power prediction is crucial for enhancing grid planning, optimizing dispatch operations, and advancing management strategies. To this end, this study proposes a short-term distributed PV power prediction method that incorporates temporal and spatial feature extraction as well as similar day analysis. Firstly, to address the poor adaptability of traditional clustering methods to time-series data, the K-shape clustering algorithm is employed to categorize the time series into different weather types. Secondly, to overcome the challenges posed by varying time resolutions in similar day analysis, a novel method based on Dynamic Time Warping (DTW) is proposed. This method calculates the similarity between the target days and the candidate historical days, considering both the time of day and the day of the week. Subsequently, a PV power generation prediction model based on a convolutional neural network and long short-term memory (CNN-LSTM) network is developed to enhance prediction accuracy. To tackle the difficulty of manual hyperparameter tuning, the chaos reverse sparrow search algorithm (CRSSA) is introduced. Finally, a case study is conducted on the measured data of a distributed photovoltaic power station in a certain region of China. In terms of RMSE and MAPE, the proposed prediction model and solving algorithm reduce the relative error by more than 1% compared with other prediction models, verifying the effectiveness of the proposed method.

Keywords: photovoltaic; power prediction; convolutional neural network; long short-term memory; sparrow optimization

1. Introduction

Amid the backdrop of severe energy shortages and global warming, renewable energy, characterized by its clean, low-carbon, and sustainable nature, is playing an increasingly crucial role in the formulation of national energy strategies [1]. Photovoltaic (PV) power generation has gained widespread attention globally due to its non-polluting, renewable, and low-cost characteristics, as well as its technological maturity [2]. However, the inherent non-storage and intermittent nature of solar energy presents significant challenges to the power grid when scaled up for widespread use [3]. Therefore, accurately predicting photovoltaic power generation is imperative for optimal grid dispatch, enhanced management, and improved energy consumption efficiency. It is also a critical factor in achieving complementary power relationships within the grid.

With continuous advancements in artificial intelligence technology, deep learning has garnered widespread attention due to its excellent performance in image processing and speech recognition [4]. Consequently, some researchers have introduced deep learning into the field of PV power prediction. Compared to traditional machine learning models, deep learning models offer more accurate prediction results owing to their superior feature extraction and data mining capabilities [5]. Among these, models such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs) show promising results in PV prediction. For instance, Ref. [6] proposed a hybrid approach based on deep CNN for short-term PV power forecasting using solar radiation, temperature, and historical electricity data. Ref. [7] developed a meteorological-information-based long short-term memory (LSTM) model to predict the daily power generation of a large-scale PV power plant by classifying weather conditions. Ref. [8] used global solar radiation, temperature, velocity, relative humidity, and power output as inputs, employing an LSTM model optimized by particle swarm optimization (PSO) to predict PV power across an entire region, demonstrating the method’s accuracy experimentally. However, the performance of a single model often falls short when dealing with large data samples, leading many researchers to propose hybrid models to improve prediction accuracy. Ref. [9] proposed a hybrid Wavelet-PSO-SVM prediction model supported by supervisory control and data acquisition (SCADA) systems and meteorological information, which was shown to effectively enhance prediction accuracy. Ref. [10] improved the prediction accuracy of distributed PV power by fusing multiple models based on a stacking integration strategy. Considering the prediction effectiveness of previous studies, this study develops a CNN-LSTM model to fuse the feature extraction function of CNNs and the timing analysis capability of LSTM.

While the studies mentioned above proposed effective methods for predicting PV power, they often overlook the potential information gain from similar day samples concerning input features. To address this gap, Ref. [11] classified weather conditions into three categories using fuzzy C-means clustering and selected samples with higher similarity to the target day for training, thereby improving prediction accuracy, while Ref. [12] employed the K-medoids clustering algorithm to categorize weather into three groups and validated the effectiveness of the prediction model under various weather conditions. By pre-dividing the weather types in the dataset and selecting samples that closely resemble the target day for training, prediction accuracy can indeed be enhanced. However, traditional clustering algorithms typically do not account for the shape information of time-series data, which makes it difficult for them to cluster such data effectively. Additionally, conventional similar day analysis methods, such as the Pearson correlation and the Euclidean distance, struggle to provide accurate similarity measures when dealing with varying time resolutions.

To address the shortcomings of the aforementioned studies and build upon their foundations, this study proposes a CRSSA-CNN-LSTM prediction method that incorporates clustering to classify weather types. This method addresses both the poor adaptability of traditional clustering methods to time series and the difficulty of analyzing similar days at different time resolutions, thereby improving prediction accuracy. This paper is organized as follows: First, two data preprocessing techniques are introduced to enhance the quality of the input data. Next, to construct similar day sample sets for the target prediction days, the K-shape clustering algorithm is utilized, along with Dynamic Time Warping (DTW) to measure similarity effectively. To achieve more accurate predictions, this study proposes a hybrid CNN-LSTM model, which combines the strengths of convolutional neural networks and long short-term memory networks. The hyperparameters of this model are optimized using an improved sparrow search algorithm, the chaos reverse sparrow search algorithm (CRSSA). Finally, the hybrid CNN-LSTM model is applied to actual PV power plant data from a region in Nanjing, China, to conduct simulations that verify the effectiveness of the proposed method.

2. Similar Day Selection Based on K-Shape and DTW

Selecting samples that closely resemble the forecast target meteorological conditions can enhance the relevance of the training data and improve the effectiveness of the training model, thereby increasing prediction accuracy. To achieve this, this study employs K-shape clustering to classify PV power into different modes. Additionally, the similarity of time-series data at varying time resolutions is assessed using DTW.

2.1. Dataset Description

In this study, we evaluate the proposed aggregated prediction method using data from a PV plant with an installed capacity of 20 kW located in a region of China. The dataset spans from January to November 2017 and includes features, such as output power, solar radiation intensity, ambient temperature, relative humidity, barometric pressure, and wind speed, recorded at 15 min intervals. The data that support the findings of this study are available on request from the corresponding author. The data are not publicly available due to privacy or ethical restrictions.

2.2. Data Preprocessing

The feature variables influencing the prediction of photovoltaic power generation often exhibit different scales and orders of magnitude, necessitating feature scaling. The multidimensional features that have been scaled are unified to a dimensionless state and have similar scales, which significantly promotes the convergence efficiency of the gradient descent algorithm. Normalization is a common choice among various processing methods. However, due to its strong dependence on the maximum and minimum values within the dataset, normalization needs to be redefined every time a new extremum is encountered. Therefore, this study uses standardization techniques as the core means of data preprocessing.

The method of standardizing the time-series data is shown in (1), which transforms each feature to a mean of 0 and a standard deviation of 1:

x^{'} = \frac{x - \bar{x}}{σ (x)}

(1)

where $x^{'}$ is the standardized data, $x$ is a sample of the original data for a particular feature, $\bar{x}$ is the sample mean, and $σ(x)$ is the sample standard deviation.
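As a minimal sketch of the standardization in (1), the following pure-Python function computes the z-scores of one feature. The function name `standardize`, the sample values, and the use of the population standard deviation (dividing by n) are assumptions made for illustration; the paper does not state which variant was used.

```python
import math


def standardize(values):
    """Return (x - mean) / std for every x in values, as in Equation (1)."""
    n = len(values)
    mean = sum(values) / n
    # Assumption: population standard deviation (divide by n, not n - 1).
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0:
        # A constant feature carries no scale information.
        return [0.0] * n
    return [(v - mean) / std for v in values]


# Made-up irradiance readings, for illustration only.
irradiance = [0.0, 120.0, 480.0, 760.0, 310.0]
z_scores = standardize(irradiance)
```

In a forecasting setting, the mean and standard deviation would normally be estimated on the training data only and then reused for the test data, so that no information from the target day leaks into the scaling.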

During the process of information acquisition, communication failures or human errors can result in a certain amount of missing values. The handling of missing data can generally be categorized into two approaches: deletion and imputation. Directly deleting samples with missing values can lead to the loss of important information, significantly impacting the feature extraction process and the overall quality of the time-series data. To address this issue, this study employs the Random Forest Linear Filling method for managing missing data [13] to enhance prediction accuracy.

2.3. K-Shape Clustering Algorithm

Traditional clustering algorithms often struggle to effectively measure the similarity between time series. The K-shape clustering algorithm, however, has demonstrated superior performance in clustering time-series data, making it particularly suitable for applications that involve such challenges [14]. In this study, we utilize the K-shape clustering process to classify the daily variation curves of distributed PV power generation into K distinct patterns.

The K-shape algorithm uses normalized cross-correlation to calculate the shape-based distance (SBD). Cross-correlation is a statistical measure of the similarity of two time series, $X = [x_{1}, x_{2}, \dots, x_{m}]$ and $Y = [y_{1}, y_{2}, \dots, y_{m}]$. It optimally shifts the time window of $Y$ so that it aligns globally with $X$. This alignment enables a comprehensive comparison of their global shape features, as illustrated in (2):

Y (s) = \begin{cases} [\overbrace{0, \dots, 0}^{| s |}, y_{1}, y_{2}, \dots, y_{m - s}], & s \geq 0 \\ [y_{1 - s}, \dots, y_{m - 1}, y_{m}, \underbrace{0, \dots, 0}_{| s |}], & s < 0 \end{cases}

(2)

where $Y(s)$ is the shifted sequence and $s \in [-m, m]$ is the shift amount. If $s > 0$, the time window of $Y$ is shifted to the right by $|s|$ units; if $s < 0$, the time window of $Y$ is shifted to the left by $|s|$ units.

Considering all the shifts $s \in [-m, m]$, a cross-correlation sequence of length $2m - 1$, $CC_{ω}(X, Y) = [c_{1}, c_{2}, \dots, c_{2m - 1}]$, is obtained, and $c_{ω}$ is defined as follows:

c_{ω} = R_{ω - m} (X, Y) = R_{k} (X, Y), \quad ω \in \{1, 2, \dots, 2 m - 1\}

(3)

where $k = ω - m$. When $k \geq 0$, $R_{k}(X, Y) = \sum_{l = 1}^{m - k} x_{l + k} y_{l}$; when $k < 0$, $R_{k}(X, Y) = R_{-k}(Y, X) = \sum_{l = 1}^{m + k} y_{l - k} x_{l}$.

The index $ω$ at which $c_{ω}$ reaches its maximum value is then identified, and the optimal shift of $Y$ with respect to $X$ is derived as $s = ω - m$. To eliminate the effect of sequence distortion, $CC_{ω}(X, Y)$ is normalized by (4) as follows:

C_{n, ω} (X, Y) = \frac{C C_{ω} (X, Y)}{\sqrt{R_{0} (X, X) \cdot R_{0} (Y, Y)}}

(4)

where $C_{n, ω}(X, Y)$ is the normalized cross-correlation sequence.

Therefore, the SBD between $X$ and $Y$ can be calculated by (5):

D_{SBD} (X, Y) = 1 - \max_{ω} (\frac{C C_{ω} (X, Y)}{\sqrt{R_{0} (X, X) \cdot R_{0} (Y, Y)}})

(5)

where $D_{SBD}(X, Y)$ ranges from 0 to 2, and $D_{SBD}(X, Y) = 0$ indicates that $X$ and $Y$ are perfectly similar.
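The SBD computation in (3) to (5) can be sketched directly in pure Python. The function names `cross_correlation` and `sbd` and the example curves are assumptions for illustration; practical K-shape implementations typically compute the cross-correlation with a fast Fourier transform, whereas this sketch uses the direct sums for readability.

```python
import math


def cross_correlation(x, y, k):
    """R_k(x, y): inner product of x and y at lag k, as in Equation (3)."""
    m = len(x)
    if k >= 0:
        return sum(x[l + k] * y[l] for l in range(m - k))
    # For negative lags the roles of the two sequences are swapped.
    return cross_correlation(y, x, -k)


def sbd(x, y):
    """Return (shape-based distance, best shift of y relative to x)."""
    m = len(x)
    norm = math.sqrt(cross_correlation(x, x, 0) * cross_correlation(y, y, 0))
    if norm == 0:
        # Assumption: an all-zero sequence is treated as uncorrelated.
        return 1.0, 0
    best_value, best_shift = max(
        (cross_correlation(x, y, k) / norm, k) for k in range(-(m - 1), m)
    )
    return 1.0 - best_value, best_shift


# Made-up daily curves: `late` is `early` delayed by two samples.
early = [0.0, 1.0, 2.0, 1.0, 0.0, 0.0]
late = [0.0, 0.0, 0.0, 1.0, 2.0, 1.0]
distance, shift = sbd(early, late)
```

Because `late` has the same shape as `early`, the distance is zero once the optimal shift is applied; the negative shift indicates that the second curve must be moved to the left to align with the first.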

2.4. Dynamic Time Warping (DTW)

In the prediction scenario, the time resolution of the day to be predicted often does not match the sampling resolution of the historical photovoltaic output power. In addition, the significant randomness and volatility of photovoltaic output make it difficult to establish a direct point-to-point correspondence between the two time series. To address this challenge, we introduce Dynamic Time Warping (DTW) to evaluate the similarity between the two. The advantage of DTW lies in its ability to nonlinearly align two curves, identifying the optimal correspondence between them. This method effectively matches points where the curves exhibit similar shapes, resulting in a more accurate morphological measurement [14].

As illustrated in Figure 1, DTW can identify and map similar points between two curves. These mappings allow each point in one curve to correspond to the most similar point in the other curve. The total distance between all corresponding points is then calculated and serves as a criterion for assessing the similarity between the two curves. The lengths of curves X and Y are defined as m and n. We then construct the distance matrix D as

D (x_{i}, y_{j}) = {(x_{i} - y_{j})}^{2} .

(6)

The curved (warping) path between the two curves is defined as $P = \{p_{1}, p_{2}, \dots, p_{s}, \dots, p_{l}\}$. The subscript $s$ indexes the $s$-th point on the curved path, with $p_{s} = D(x_{i}, y_{j})$, and $l$ denotes the number of elements in the path. A curved path represents the mapping relationship between two curves, visually illustrated by the lines connecting corresponding points on the curves in Figure 1.

The objective of DTW is to identify an optimal curved path that minimizes the cumulative distance between two time series, while adhering to the constraints of boundary conditions, monotonicity, and continuity:

D T W (X, Y) = \min \sum_{s = 1}^{l} p_{s}

(7)
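The minimization in (7) is usually solved by dynamic programming. The following is a minimal sketch of that standard recursion with the squared point distance of (6); the function name `dtw_distance` and the sample curves are assumptions, and no window constraint is applied.

```python
def dtw_distance(x, y):
    """Cumulative DTW cost between sequences x (length m) and y (length n)."""
    m, n = len(x), len(y)
    inf = float("inf")
    # acc[i][j] holds the minimal cumulative cost of aligning x[:i] with y[:j].
    acc = [[inf] * (n + 1) for _ in range(m + 1)]
    acc[0][0] = 0.0  # boundary condition: both paths start at the first points
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = (x[i - 1] - y[j - 1]) ** 2  # Equation (6)
            # Monotonicity and continuity: only three predecessor cells allowed.
            acc[i][j] = cost + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[m][n]


# Made-up curves with the same shape but different time resolutions.
coarse = [0.0, 1.0, 0.0]
fine = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0]
same_shape_cost = dtw_distance(coarse, fine)
```

The example illustrates why DTW suits the mismatch in time resolution: the two curves have different lengths, yet each coarse point can be mapped to several fine points at no cost.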

2.5. Similarity Day Selection Process

Initially, the output from distributed PV systems is categorized into K distinct patterns using the K-shape clustering algorithm. The feature vectors of the standardized data for the day to be predicted and for the historical data are organized as follows:

P_{0} = [P_{0} (1), P_{0} (2), \dots, P_{0} (m)]

(8)

P_{k} = [P_{k} (1), P_{k} (2), \dots, P_{k} (n)]

(9)

where $P_{0}$ is the feature vector of the day to be predicted; $P_{k}$ is the feature vector of the centroid of the $k$-th PV output pattern; $m$ is the number of forecast points of the day to be predicted; and $n$ is the number of sampling points of the historical days. In the absence of a priori knowledge, K is selected using the silhouette coefficient, which yields the number of clusters with the highest intra-cluster similarity and the lowest inter-cluster similarity.

In forecasting, DTW is utilized to assess the similarity between the centroid of each historical sample cluster and the target day that is to be predicted:

S = D T W (P_{0}, P_{k})

(10)

A smaller value of $S$ indicates a higher similarity. The cluster of samples with the highest similarity is selected to train the prediction model, thereby improving the prediction accuracy.

3. Distributed PV Power Prediction Model

The methods outlined above generate high-quality input data for the prediction task, and the design of a more efficient prediction model has a direct impact on accuracy. To address this, this study proposes a hybrid prediction method that combines LSTM networks and CNN to enhance prediction accuracy through the effective extraction of both temporal and spatial features.

3.1. Convolutional Neural Networks

In this study, we thoroughly consider the spatial and temporal correlations within PV time-series data. Initially, spatial features are extracted using feature extraction methods to enhance the model’s local prediction performance while also reducing dimensionality. Traditional feature extraction techniques, such as principal component analysis and linear discriminant analysis, have inherent limitations when it comes to nonlinear feature extraction and generalization. In contrast, CNN can effectively capture essential local features from the input data through convolutional and pooling layers, thereby improving the model’s prediction accuracy [15]. In addition, the outstanding ability of CNN models in feature extraction provides strong support for reducing uncertainty in prediction during periods of sharp power fluctuations.

3.2. Long Short-Term Memory Networks

LSTM networks excel at capturing temporal feature trends in PV output power data and have been extensively utilized in PV power generation forecasting, yielding favorable outcomes [16]. In this study, we integrate LSTM with CNN to learn the spatio-temporal features of PV data, with the goal of achieving high-precision predictions for PV power generation. The underlying principles of LSTM are as follows.

LSTM introduces a state unit $c$ that builds upon the architecture of a recurrent neural network (RNN). It regulates the information state of the LSTM network at each moment using three key components: the forget gate, the input gate, and the output gate. The internal structure of the LSTM is illustrated in Figure 2. The LSTM unit has three inputs at time $t$: the network input $x_{t}$ at the current moment, which is the historical power information after feature extraction by the CNN; the output of the LSTM hidden layer $h_{t - 1}$ at the previous moment; and the unit state $c_{t - 1}$ at the previous moment. The LSTM unit has two outputs at time $t$: the output of the hidden layer $h_{t}$ and the unit state $c_{t}$ at the current moment.

The forget gate controls how much of the previous cell state $c_{t - 1}$ is retained in the current cell state $c_{t}$. The input gate controls how much of the current input $x_{t}$ is saved to the cell state $c_{t}$, and the output gate controls how much of the current cell state $c_{t}$ is passed to the hidden-layer output $h_{t}$ at the current moment. The above process is depicted as follows:

f_{t} = σ (W_{f} [h_{t - 1}, x_{t}] + b_{f})

(11)

i_{t} = σ (W_{i} \cdot [h_{t - 1}, x_{t}] + b_{i})

(12)

\tilde{c_{t}} = \tanh (W_{c} \cdot [h_{t - 1}, x_{t}] + b_{c})

(13)

c_{t} = f_{t} \circ c_{t - 1} + i_{t} \circ \tilde{c_{t}}

(14)

o_{t} = σ (W_{o} [h_{t - 1}, x_{t}] + b_{o})

(15)

h_{t} = o_{t} \circ \tanh (c_{t})

(16)

where $f_{t}$, $i_{t}$, and $o_{t}$ denote the gating coefficients of the LSTM’s forget gate, input gate, and output gate, respectively; $b_{f}$, $b_{i}$, $b_{o}$, and $b_{c}$ are the corresponding bias terms; $W_{f}$, $W_{i}$, $W_{c}$, and $W_{o}$ are the corresponding weight matrices; $\circ$ denotes element-wise multiplication; $\tilde{c_{t}}$ is the candidate cell state; and $σ$ denotes the sigmoid activation function.
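To make the data flow of (11) to (16) concrete, the following sketch performs a single LSTM time step with plain Python lists. The helper names (`sigmoid`, `affine`, `lstm_step`), the parameter dictionary layout, and the zero-valued weights are assumptions chosen so the result is easy to verify by hand; a trained model would use learned weights from a deep learning framework.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(weights, vector, bias):
    """Compute W v + b, with W given as a list of rows."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM step; p maps names such as 'W_f' and 'b_f' to parameters."""
    z = h_prev + x_t  # list concatenation, i.e. [h_{t-1}, x_t]
    f = [sigmoid(a) for a in affine(p["W_f"], z, p["b_f"])]          # (11)
    i = [sigmoid(a) for a in affine(p["W_i"], z, p["b_i"])]          # (12)
    c_tilde = [math.tanh(a) for a in affine(p["W_c"], z, p["b_c"])]  # (13)
    c = [fk * ck + ik * gk
         for fk, ck, ik, gk in zip(f, c_prev, i, c_tilde)]           # (14)
    o = [sigmoid(a) for a in affine(p["W_o"], z, p["b_o"])]          # (15)
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]                 # (16)
    return h, c


# Illustrative setup: hidden size 2, input size 1, all parameters zero.
hidden, inputs = 2, 1
zeros = [[0.0] * (hidden + inputs) for _ in range(hidden)]
params = {name: zeros for name in ("W_f", "W_i", "W_c", "W_o")}
params.update({name: [0.0] * hidden for name in ("b_f", "b_i", "b_c", "b_o")})
h_t, c_t = lstm_step([0.7], [0.0, 0.0], [1.0, -2.0], params)
```

With all parameters at zero, every gate equals 0.5 and the candidate state is zero, so the cell state is simply halved. This isolates the role of the forget gate in (14).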

3.3. CNN-LSTM Hybrid Neural Network Model

The CNN-LSTM hybrid neural network structure proposed in this study consists of two different neural networks connected in series, as depicted in Figure 3. In this structure, the CNN convolutional layer applies filters to the input data, iteratively sliding across it to extract features related to PV data. The pooling layer performs feature downsampling, which reduces the number of parameters and helps prevent model overfitting. The fully connected layer is responsible for establishing the mapping relationship between the high-quality features identified in the previous layers and the final prediction results. One of the key advantages of CNNs in predicting PV power generation is their ability to automatically learn spatio-temporal features from the data, such as variations in daylight and weather conditions, without requiring manual feature extraction. The output from the CNN will then serve as input to the LSTM to capture temporal feature information. In addressing the regression problem, the hybrid CNN-LSTM network will leverage the weights and bias parameters of the fully connected layer to effectively map the extracted feature vectors to the predicted PV power generation outcomes.

4. Hyperparameter Adaptive Tuning

The multi-distributed PV prediction method proposed in this study integrates CNN and LSTM models. To achieve optimal performance, it is essential to fine-tune the hyperparameters of both models; however, relying solely on manual experience for this adjustment can be challenging. Therefore, this study introduces a hyperparameter tuning method based on an improved sparrow optimization algorithm aimed at obtaining the optimal set of hyperparameters that minimizes the error, as defined by the objective function. To enhance the global search capability of the optimization algorithm, two improvement strategies are introduced to increase the diversity of the solution set.

4.1. Sparrow Optimization Algorithm

The sparrow search algorithm (SSA) is a swarm intelligence optimization algorithm inspired by the foraging and anti-predation behavior of sparrows. Compared with other heuristic algorithms such as particle swarm optimization (PSO) and the grey wolf optimizer (GWO), it offers fast iteration and convergence and high solution efficiency [17]. The specific principles are as follows:

We define the number of sparrows in SSA as N and the search dimension as D. The positions of all sparrows are denoted as:

X = \begin{bmatrix} x_{11} & x_{12} & \dots & x_{1 d} & \dots & x_{1 D} \\ x_{21} & x_{22} & \dots & x_{2 d} & \dots & x_{2 D} \\ \vdots & \vdots & & \vdots & & \vdots \\ x_{i 1} & x_{i 2} & \dots & x_{i d} & \dots & x_{i D} \\ \vdots & \vdots & & \vdots & & \vdots \\ x_{N 1} & x_{N 2} & \dots & x_{N d} & \dots & x_{N D} \end{bmatrix}

(17)

(1) SSA classifies sparrows into three roles based on their foraging behavior: discoverers, joiners, and early warners. Among them, the discoverer is characterized by its adaptability and wide search range. Discoverers are responsible for locating food and guiding the population in foraging. The positional iteration formula for the discoverer role is as follows:

x_{i d}^{t + 1} = \begin{cases} x_{i d}^{t} \cdot \exp (\frac{- i}{α \cdot MaxIter}), & R_{2} < S_{T} \\ x_{i d}^{t} + Q \cdot L, & R_{2} \geq S_{T} \end{cases}

(18)

where MaxIter is the maximum number of iterations; $t$ denotes the current iteration; $x_{i d}^{t}$ denotes the position of sparrow $i$ in the $d$-th dimension at iteration $t$; $α \in (0, 1]$ is a random number; $R_{2} \in [0, 1]$ represents the warning value; $S_{T} \in [0.5, 1]$ represents the safety value; $Q$ is a random number obeying a normal distribution; and $L$ denotes a $1 \times d$ matrix whose elements are all 1. $R_{2} < S_{T}$ indicates that the environment is safe, with no predator around the forager, so it can search widely. $R_{2} \geq S_{T}$ indicates that the environment is dangerous, meaning the foraging strategy needs to be adjusted such that the group moves to the safe area.
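The discoverer update in (18) can be sketched as follows. The function name `update_discoverer`, its argument names, and the use of Python’s `random.Random` generator are assumptions for illustration; `rank` stands for the sparrow index $i$, and $α$, $R_{2}$, and $Q$ are drawn once per sparrow and iteration.

```python
import math
import random


def update_discoverer(position, rank, max_iter, safety_threshold, rng):
    """Return the new position of one discoverer, following Equation (18)."""
    r2 = rng.random()  # warning value R2 in [0, 1)
    if r2 < safety_threshold:
        # Safe environment: contract the position by an exponential factor.
        alpha = 1.0 - rng.random()  # random number in (0, 1]
        factor = math.exp(-rank / (alpha * max_iter))
        return [x * factor for x in position]
    # Danger: every dimension is shifted by the same normal random step
    # (Q times a vector of ones).
    q = rng.gauss(0.0, 1.0)
    return [x + q for x in position]


rng = random.Random(0)
# A threshold above 1 forces the safe branch; 0.0 forces the danger branch.
safe_move = update_discoverer([2.0, -4.0], 1, 100, 1.1, rng)
danger_move = update_discoverer([2.0, -4.0], 1, 100, 0.0, rng)
```

In the safe branch, each coordinate keeps its sign and shrinks in magnitude, while in the danger branch the whole position vector is translated by a common random offset.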

(2) The joiner follows the discoverer in searching for food and may compete with it to increase its own foraging rate. If the discoverer fails to take the lead in the foraging process, the joiner will update its foraging route to seek out other, better-adapted discoverers. The positional iteration formula for the joiner role is as follows:

x_{i d}^{t + 1} = \begin{cases} Q \cdot \exp (\frac{x_{W d}^{t} - x_{i d}^{t}}{i^{2}}), & i > N / 2 \\ x_{B d}^{t + 1} + \frac{1}{D} \sum_{d = 1}^{D} (\operatorname{rand} \{- 1, 1\} \cdot | x_{i d}^{t} - x_{B d}^{t + 1} |), & i \leq N / 2 \end{cases}

(19)

where $x_{W d}^{t}$ denotes the location of the worst sparrow in dimension $d$ at iteration $t$, and $x_{B d}^{t + 1}$ denotes the location of the optimal sparrow in dimension $d$ at iteration $t + 1$. When $i > N / 2$, the less adapted sparrow $i$ is in a very hungry state and needs to move to another foraging area; when $i \leq N / 2$, the better-adapted sparrow $i$ forages near the optimal location of the current iteration.

(3) Early warners make up 10% to 20% of the sparrow population, and their initial positions are determined randomly. They alert other sparrows to perform anti-predation behavior when danger arises, with the following positional iteration formula:

x_{i d}^{t + 1} = \begin{cases} x_{B d}^{t} + β \cdot | x_{i d}^{t} - x_{P d}^{t} |, & f_{i} \neq f_{g} \\ x_{i d}^{t} + K (\frac{x_{i d}^{t} - x_{B d}^{t}}{| f_{i} - f_{w} | + ε}), & f_{i} = f_{g} \end{cases}

(20)

where $β$ and $K$ are step coefficients: $β \sim N(0, 1)$ is a normally distributed random number, and $K \in [- 1, 1]$ represents the direction of the sparrow’s movement; $ε$ is a very small constant that prevents a zero denominator; $f_{i}$ is the fitness of sparrow $i$; and $f_{w}$ and $f_{g}$ represent the worst and optimal fitness of the current sparrow group, respectively. When $f_{i} \neq f_{g}$, the sparrow is at the edge of the group; when $f_{i} = f_{g}$, the current sparrow is located in the middle of the group.

4.2. Improvement Strategies

(1): Chaos Initialization Strategy

Traditional sparrow optimization algorithms initialize the population through random numbers, and this random generation may make the generated individuals unevenly distributed, leading to a reduction in population diversity and optimization search speed. To address these challenges, this study proposes generating the sparrow population using a chaotic mapping strategy. Commonly utilized chaotic mappings include cat mapping, tent mapping, and Singer mapping. Research has indicated that tent chaotic mapping offers significant advantages in optimization algorithms regarding traversability, uniformity, regularity, and iteration speed [18]. Therefore, in this study, we will employ tent mapping to generate the population sequence, expressed as follows:

x_{t + 1}^{i} = \begin{cases} 2 x_{t}^{i}, & 0 \leq x_{t}^{i} \leq 0.5 \\ 2 (1 - x_{t}^{i}), & 0.5 < x_{t}^{i} \leq 1 \end{cases}

(21)
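A minimal sketch of the tent-map initialization in (21) is given below. The function names, the per-dimension seeding, and the hyperparameter bounds in the example (a learning rate and a number of LSTM cells) are assumptions for illustration and are not taken from the paper.

```python
import random


def tent_map(x):
    """One iteration of the tent map in Equation (21)."""
    return 2.0 * x if x <= 0.5 else 2.0 * (1.0 - x)


def chaotic_population(n_sparrows, lower, upper, rng):
    """Generate n_sparrows positions inside [lower, upper] per dimension."""
    # Assumption: one chaotic sequence per dimension, each with its own seed.
    state = [rng.uniform(0.01, 0.99) for _ in lower]
    population = []
    for _ in range(n_sparrows):
        state = [tent_map(s) for s in state]
        # Map the chaotic value in [0, 1] onto the search range.
        population.append(
            [lo + s * (hi - lo) for s, lo, hi in zip(state, lower, upper)]
        )
    return population


# Hypothetical bounds: learning rate and number of LSTM cells.
population = chaotic_population(10, [1e-4, 16.0], [1e-2, 128.0], random.Random(7))
```

One practical caveat: with a slope of exactly 2, binary floating-point arithmetic drives the tent map to zero after roughly fifty iterations, so long sequences need a perturbation or a slightly modified map. The short sequences used here are not affected.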

(2): Reverse Learning Strategy

Reverse learning is inspired by the relationships between entities in the real world, aiming to find a corresponding reverse solution based on the current solution by utilizing the upper and lower bounds alongside the optimal solution. Subsequently, the strategy of population merging is employed to combine the reverse learning population with the original population. To this end, this study integrates the reverse learning strategy into the sparrow algorithm to improve its solving efficiency and global search capability and to avoid problems such as premature convergence. This integration can be mathematically characterized as follows:

{X^{'}}_{best} (t) = ub + r \oplus (lb - X_{best} (t))

(22)

where $X^{'}_{best}(t)$ denotes the reverse solution of the optimal solution at the $t$-th iteration, $ub$ and $lb$ are the upper and lower bounds of the decision variables, and $r$ is a $1 \times d$ matrix ($d$ is the spatial dimension) of random numbers obeying the standard uniform distribution on (0, 1).
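The reverse solution in (22) can be sketched as below. Reading $\oplus$ as element-wise multiplication is an assumption, as are the function names, the sphere test function, and the greedy choice between the current best and its reverse solution, which stands in for the population merging described above.

```python
import random


def reverse_solution(best, lower, upper, rng):
    """Equation (22) per dimension: x' = ub + r * (lb - x), with r in [0, 1)."""
    return [ub + rng.random() * (lb - x) for x, lb, ub in zip(best, lower, upper)]


def keep_better(current, candidate, fitness):
    """Greedy stand-in for population merging (minimization)."""
    return candidate if fitness(candidate) < fitness(current) else current


def sphere(position):
    """Hypothetical objective used only for this illustration."""
    return sum(x * x for x in position)


rng = random.Random(42)
lower, upper = [-5.0, -5.0], [5.0, 5.0]
best = [4.0, -3.0]
candidate = reverse_solution(best, lower, upper, rng)
new_best = keep_better(best, candidate, sphere)
```

Since $lb - X_{best}$ is never positive inside the bounds, the reverse solution always lies between $lb + ub - X_{best}$ and $ub$, so it stays within the feasible range without clipping.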

4.3. Objective Functions and Constraints

In heuristic optimization frameworks, it is crucial to construct a quantitative indicator that reflects the predictive performance of the model as the objective function. In order to comprehensively and accurately evaluate the performance of the model, the K-fold cross-validation strategy is introduced, which ensures that each data fold participates in the training and validation process. This strategy not only reduces the risk of poor generalization caused by overfitting the training data but also mitigates the biased evaluation results caused by scarce validation data. In this study, we utilize the mean squared error averaged over the K folds of cross-validation as the objective function to be minimized:

\min f = \frac{1}{2 K_{V}} \sum_{i = 1}^{K_{V}} \frac{1}{N_{V}} \sum_{j = 1}^{N_{V}} {({\hat{y}}_{j} - y_{j})}^{2}

(23)

where $K_{V}$ represents the number of folds for K-fold cross-validation; $N_{V}$ represents the number of samples in each fold; ${\hat{y}}_{j}$ represents the predicted value of the $j$-th sample; and $y_{j}$ represents the true value of the $j$-th sample.
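The cross-validated objective can be sketched as a function that the optimizer calls once per candidate hyperparameter set. The names `kfold_mse`, `fit`, and `predict` are hypothetical, the interleaved fold assignment is an assumption, and the constant scaling factor is omitted because it does not change which hyperparameters are best. A mean-value predictor stands in for the CNN-LSTM model so that the sketch stays self-contained.

```python
def kfold_mse(samples, targets, k_folds, fit, predict):
    """Average validation mean squared error over k_folds folds."""
    n = len(samples)
    fold_errors = []
    for fold in range(k_folds):
        # Assumption: interleaved folds (sample i belongs to fold i mod k).
        val_idx = [i for i in range(n) if i % k_folds == fold]
        train_idx = [i for i in range(n) if i % k_folds != fold]
        model = fit([samples[i] for i in train_idx],
                    [targets[i] for i in train_idx])
        predictions = predict(model, [samples[i] for i in val_idx])
        mse = sum((p - targets[i]) ** 2
                  for p, i in zip(predictions, val_idx)) / len(val_idx)
        fold_errors.append(mse)
    return sum(fold_errors) / k_folds


def fit_mean(train_samples, train_targets):
    """Hypothetical stand-in model: remembers the mean training target."""
    return sum(train_targets) / len(train_targets)


def predict_mean(model, val_samples):
    return [model for _ in val_samples]


objective = kfold_mse([[0], [1], [2], [3]], [1.0, 2.0, 3.0, 4.0], 2,
                      fit_mean, predict_mean)
```

For time-series data, contiguous or forward-chaining folds are often preferred over interleaved ones to limit leakage between neighbouring samples; the fold scheme is therefore a design choice rather than part of the objective itself.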

The decision variables of the optimization model are the hyperparameter settings, subject to the following constraint:

p_{min} \leq p \leq p_{max}

(24)

where $p$ represents the set of hyperparameters of the model, and $p_{max}$ and $p_{min}$ are the upper and lower limits of the set of hyperparameters in the model.

4.4. Distributed PV Power Prediction Process

In this study, a multi-PV aggregation prediction model based on CRSSA-CNN-LSTM is proposed, and the specific flow is illustrated in Figure 4. The main steps of the algorithm are as follows:

(1): Data preprocessing: This step involves filling in missing values and standardizing the distributed PV power data using the methods described in Section 2.2.
(2): Similar day selection: The output of a PV plant is categorized into K patterns using the K-shape clustering algorithm. The similarity between the day to be predicted and each output pattern is assessed using DTW. The sample cluster that exhibits the highest similarity is selected for training.
(3): Hyperparameter optimization: The CRSSA is used to optimize various hyperparameters of the model, including the learning rate, the batch size, the number of network layers and the convolutional kernel size for the CNN, as well as the number of network layers and cells for the LSTM.
(4): Model training: A CNN-LSTM model is trained to predict the output power for the target day. The training utilizes output power data from the selected historical sample clusters along with meteorological features obtained from the monitoring system, which include irradiance, relative humidity, ambient temperature, barometric pressure, and wind speed. These inputs serve to enhance the model’s ability to accurately forecast the PV output power.
(5): Model evaluation: The performance of the model is evaluated by a test set.

5. Experiment Analysis

5.1. Data

The experiments use the dataset described in Section 2.1. To validate the effectiveness of the proposed method, we set up three sets of experiments to compare prediction performance: (1) we compare the prediction model proposed in this study with various baseline models to assess its predictive accuracy and performance; (2) we conduct experiments to evaluate the prediction accuracy before and after incorporating the division of weather types, measuring the impact of this consideration on model performance; (3) we compare the optimization algorithm proposed in this study with other existing optimization algorithms to evaluate its efficiency and effectiveness in enhancing the model’s performance.

5.2. Indicators

The predictive performance of the proposed model is assessed using the root mean square error (RMSE), the mean absolute percentage error (MAPE), and the coefficient of determination (R-square, R²). Smaller RMSE and MAPE values indicate more accurate predictions, while a coefficient of determination closer to 1 reflects a better fit between the predicted and actual values. The three indicators are calculated using the following formulae, where n is the number of samples, ŷ_i and y_i are the predicted and actual values of sample i, and ȳ is the mean of the actual values:

γ_{RMSE} = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} {({\hat{y}}_{i} - y_{i})}^{2}}

(25)

γ_{MAPE} = \frac{1}{n} \sum_{i = 1}^{n} \frac{| {\hat{y}}_{i} - y_{i} |}{y_{i}}

(26)

γ_{R^{2}} = 1 - \frac{\sum_{i = 1}^{n} {({\hat{y}}_{i} - y_{i})}^{2}}{\sum_{i = 1}^{n} {(y_{i} - \bar{y})}^{2}}

(27)
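
As a worked illustration of Equations (25) to (27), the three indicators can be computed as below. This sketch uses the standard definitions: RMSE includes the square root, and the total sum of squares in R² is taken about the mean of the actual values. The function names and sample values are hypothetical, and skipping zero-valued actual samples in MAPE is an assumption of this example that the text does not specify.

```python
from math import sqrt

def rmse(y_true, y_pred):
    """Root mean square error."""
    n = len(y_true)
    return sqrt(sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / n)

def mape(y_true, y_pred):
    """Mean absolute percentage error, returned as a fraction.

    MAPE is undefined when an actual value is 0 (for example night-time
    PV output), so those samples are skipped here (an assumption).
    """
    pairs = [(t, p) for t, p in zip(y_true, y_pred) if t != 0]
    return sum(abs(p - t) / abs(t) for t, p in pairs) / len(pairs)

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((p - t) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical actual and predicted power values
y_true = [2.0, 4.0, 6.0, 8.0]
y_pred = [2.5, 3.5, 6.5, 7.5]
scores = (rmse(y_true, y_pred), mape(y_true, y_pred), r2(y_true, y_pred))
```

For these toy values every error is 0.5, so the RMSE is 0.5, the R² is 1 - 1/20 = 0.95, and the MAPE is roughly 0.13.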

5.3. Feature Selection

Selecting input features according to how strongly the photovoltaic output depends on each weather factor can enhance prediction accuracy and mitigate the risk of overfitting by eliminating irrelevant or redundant features. Therefore, this section employs the Pearson coefficient, the Spearman coefficient, and the maximal information coefficient (MIC) to analyze both linear and nonlinear correlations between the influencing factors and PV output power. The results of the correlation coefficient calculations are presented in Table 1. The analysis reveals a strong correlation between irradiance and photovoltaic power, with Pearson, Spearman, and MIC values of 0.98, 0.92, and 0.94, respectively. Additionally, the Pearson coefficients for relative humidity and ambient temperature both exceed 0.35 (0.37 and 0.41), indicating that these factors are also correlated with PV power output, although far less strongly than irradiance. The coefficients for wind speed and barometric pressure are lower still. Based on these findings, irradiance, relative humidity, and ambient temperature are selected as the input features from the original dataset, as they are the most influential in predicting PV power output.
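
The two rank and linear correlation measures used here can be sketched in a few lines. The example below is illustrative only: the function names and the toy irradiance and power values are hypothetical, and MIC is omitted because it requires a grid-search estimator that does not fit in a short example.

```python
from math import sqrt

def pearson(x, y):
    """Pearson linear correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Spearman coefficient: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Toy data (hypothetical): power rises monotonically but not linearly
irradiance = [0, 100, 300, 600, 900]
power = [0.0, 1.0, 4.0, 10.0, 12.0]
linear, monotone = pearson(irradiance, power), spearman(irradiance, power)
```

On this toy data the Spearman coefficient is 1 because the relationship is perfectly monotone, while the Pearson coefficient is slightly below 1 because it is not perfectly linear. This difference is the reason for reporting more than one measure.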

5.4. Cluster Number Analysis

In this section, PV power data and the selected meteorological factors are used as the clustering indices. The silhouette coefficient (labelled the contour coefficient in Figure 5) is employed as the evaluation metric for clustering quality, and the results are illustrated in Figure 5. The analysis indicates that, starting from three clusters, the silhouette coefficient gradually decreases as the number of clusters increases. Notably, clustering on the meteorological factors is considerably less effective than clustering on PV output power: the silhouette coefficient obtained with the power envelope parameter as the clustering index is significantly higher than that obtained with the meteorological factors. Specifically, when the number of clusters is set to three, the silhouette coefficient reaches 0.7466, which is considerably better than the results obtained with the other clustering indices. Based on these findings, three clusters are selected as the optimal configuration for this study.
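
The evaluation metric used in this section can be illustrated with a small sketch. The code below computes the mean silhouette coefficient of a labelled set of points; it assumes Euclidean distance for simplicity, whereas a shape-based clustering method would normally be scored with its own distance, and the function name and toy points are hypothetical.

```python
from math import dist

def silhouette_score(points, labels):
    """Mean silhouette coefficient (Euclidean); needs at least two clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        # (the distance from p to itself contributes 0 to the sum)
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        scores.append(0.0 if max(a, b) == 0 else (b - a) / max(a, b))
    return sum(scores) / len(scores)

# Two well-separated groups of toy points (hypothetical)
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = silhouette_score(points, [0, 0, 1, 1])  # labels follow the groups
bad = silhouette_score(points, [0, 1, 0, 1])   # labels cut across the groups
```

A labelling that follows the natural groups scores close to 1, while one that cuts across them scores below 0. Repeating the calculation for each candidate number of clusters and keeping the highest score is the selection rule applied in this section.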

5.5. Analysis of Forecast Results

This section presents the forecast results of the joint CRSSA-CNN-LSTM prediction model. To validate the effectiveness of the proposed method under different temporal resolutions, the samples from each cluster are divided into two subsets: a training set comprising 80% of the data and a test set comprising the remaining 20%. The training set samples are further categorized into three distinct patterns, as illustrated in Figure 6. The analysis reveals that Weather Pattern I exhibits a smooth profile, resembling typical sunny day characteristics. In contrast, Weather Patterns II and III exhibit more fluctuating profiles, akin to the conditions observed on cloudy and rainy days. Notably, because K-shape clustering organizes the PV output curves based on their shape information, some curves with lower outputs but a flatter trend than those in Patterns II and III are also classified into Weather Pattern I. The K-shape clustering algorithm proves to be a straightforward and efficient method for organizing the PV output curves, facilitating a better understanding of how different weather conditions affect photovoltaic performance.
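
The observation that low-output curves can fall into Weather Pattern I follows from the shape-based distance (SBD) on which K-shape relies: after z-normalization, amplitude no longer matters. The sketch below computes SBD as one minus the maximum normalized cross-correlation over all shifts. It is a simplified illustration for equal-length, non-constant series, and the curve values are hypothetical.

```python
from math import sqrt

def z_normalize(x):
    """Zero mean, unit variance; fails for a constant series (std = 0)."""
    n = len(x)
    mean = sum(x) / n
    std = sqrt(sum((v - mean) ** 2 for v in x) / n)
    return [(v - mean) / std for v in x]

def cross_correlation(x, y, shift):
    """Sum of x[i + shift] * y[i] over the overlapping indices."""
    n = len(x)
    return sum(x[i + shift] * y[i] for i in range(n) if 0 <= i + shift < n)

def sbd(x, y):
    """Shape-based distance: 1 - max normalized cross-correlation."""
    x, y = z_normalize(x), z_normalize(y)
    norm = sqrt(sum(v * v for v in x)) * sqrt(sum(v * v for v in y))
    best = max(cross_correlation(x, y, s)
               for s in range(-(len(x) - 1), len(x)))
    return 1.0 - best / norm

# Toy daily curves (hypothetical, arbitrary units)
sunny = [0, 1, 3, 6, 8, 6, 3, 1, 0]
low_sunny = [0.3 * v for v in sunny]   # same shape, much lower output
cloudy = [0, 2, 1, 5, 2, 6, 1, 2, 0]   # fluctuating shape

d_same_shape = sbd(sunny, low_sunny)
d_other_shape = sbd(sunny, cloudy)
```

The scaled-down curve has a distance of essentially zero to the full-output sunny curve, whereas the fluctuating curve is clearly further away, so a shape-based method groups the first two together regardless of their magnitudes.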

To verify the superiority of the K-shape clustering algorithm used in this article, its average silhouette (contour) coefficient was compared with those of other clustering algorithms, including K-means and K-medoids (Figure 7). K-shape achieves the highest coefficient and therefore the best clustering performance among the methods compared.

To verify the superiority of the DTW algorithm used in this article, similar day selection based on DTW was compared with selection based on other distance measures, including Euclidean distance, Manhattan distance, and cosine distance (Table 2). The experimental results show that DTW yields the lowest prediction error and is therefore the most suitable measure for similar day analysis.

To demonstrate the effectiveness of the prediction method following the clustering of weather patterns proposed in this study, we compare the predicted results from various models before and after categorizing the weather days, as shown in Table 3 and Table 4. The comparison involves several models, including long short-term memory (LSTM), Backpropagation Neural Network (BPNN), and Wavelet Neural Network (WNN). The hyperparameters of each model are optimized by the CRSSA algorithm proposed in this study. The process begins with calculating the similarity measure between the day to be predicted and the defined clusters. This is followed by training on the sample cluster that exhibits the highest similarity, effectively screening the data from the perspective of the inputs. This approach enhances the prediction accuracy of every model. The results indicate that the proposed method achieves a reduction of up to 1.2 percentage points in the MAPE and a decrease of 1.47 in the RMSE when compared to the LSTM model. This suggests that the proposed model improves the accuracy of PV power predictions. Furthermore, the prediction accuracy of the LSTM model surpasses that of the traditional BPNN and WNN models, highlighting the efficacy of LSTM’s temporal memory function for addressing prediction challenges. Additionally, the prediction performance of the model after applying CNN feature extraction is superior to that of the original LSTM model, confirming that the CNN effectively extracts relevant features, contributing to improved predictive accuracy.

To illustrate the superiority of the prediction model proposed in this study, we present typical-day prediction results from the test set for each weather pattern, as depicted in Figure 8, Figure 9 and Figure 10. The analysis reveals that the CNN-LSTM model developed in this study achieves the highest prediction accuracy among the models evaluated. Notably, the fluctuations in PV power under Weather Pattern I are minimal, and the prediction results from all models closely align with the actual values. This observation indicates that the models effectively learn the output characteristics associated with sunny conditions. Furthermore, the prediction accuracy under Weather Pattern I is significantly greater than that under Weather Patterns II and III, highlighting the impact that varying weather conditions have on prediction accuracy. If historical data corresponding to Weather Patterns II or III were used to train the model for Weather Pattern I, its prediction accuracy could be adversely affected. This finding is consistent with the results detailed in Table 3 and Table 4, further validating the reasonableness and effectiveness of the proposed method.

To demonstrate the effectiveness of the hyperparameter optimization scheme proposed in this study, the Nanjing area in China is used as a case study. In this analysis, the manually adjusted hyperparameters serve as the baseline (denoted as “Base”). Various optimization algorithms, including the CRSSA, the sparrow search algorithm (SSA), particle swarm optimization (PSO), and ant colony optimization (ACO), are employed to optimize the hyperparameters of the CNN-LSTM prediction model. The resulting error indicators are presented in Table 5, and the convergence curves are depicted in Figure 11. It is important to note that the objective function used in the iterative process is the average value derived from K-fold cross-validation of the training set. The results indicate that employing intelligent optimization algorithms for hyperparameter tuning significantly enhances the prediction performance of the model. Among the optimization methods evaluated, the proposed CRSSA exhibits the best convergence performance and achieves the lowest RMSE and MAPE values in predicting PV power.
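
The cross-validated objective mentioned above can be sketched as follows. This is a hedged illustration rather than the authors' code: `kfold_indices`, `cv_objective`, the stand-in `fit` and `score` functions, and the `scale` hyperparameter are all hypothetical, and contiguous unshuffled folds are an assumption. In practice `fit` would train the CNN-LSTM with the candidate hyperparameters and the optimizer would minimize the returned value.

```python
def kfold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for f in range(k):
        size = base + (1 if f < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cv_objective(params, samples, k, fit, score):
    """Average validation error of `params` over k folds (lower is better)."""
    errors = []
    for held_out in kfold_indices(len(samples), k):
        held = set(held_out)
        train = [s for i, s in enumerate(samples) if i not in held]
        valid = [samples[i] for i in held_out]
        model = fit(params, train)
        errors.append(score(model, valid))
    return sum(errors) / k

# Stand-in model (hypothetical): predict a scaled mean of the training data
def fit(params, train):
    return params["scale"] * sum(train) / len(train)

# Validation RMSE of the constant prediction
def score(model, valid):
    return (sum((model - v) ** 2 for v in valid) / len(valid)) ** 0.5

samples = [4.0, 5.0, 6.0, 5.0, 4.0, 6.0]
candidates = [{"scale": 0.5}, {"scale": 1.0}]
best = min(candidates, key=lambda p: cv_objective(p, samples, 3, fit, score))
```

Averaging over folds makes the fitness value less sensitive to a single lucky or unlucky split, which matters when a population-based optimizer compares many candidate settings.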

6. Conclusions

In this study, a distributed photovoltaic (PV) power prediction method that uses clustering to classify weather types is proposed and validated through simulation with measured data from distributed PV power plants in a region of Nanjing, China. The findings of this study are summarized as follows: (1) The integration of multiple suitable models significantly enhances prediction accuracy. Specifically, the combination of CNN and LSTM networks improves performance compared to using either CNN or LSTM alone. (2) The careful setting of hyperparameters is crucial for the model’s performance; appropriate hyperparameter settings can significantly improve the accuracy of the model. Experimental analysis demonstrates that the proposed hyperparameter optimization method can reduce the RMSE by up to 1.629 and the MAPE by 3.4 percentage points. (3) The approach of considering clustered weather conditions and selecting sample clusters with high similarity to the day being predicted for training purposes can effectively improve prediction accuracy. This method yields an average reduction in RMSE of 0.546 and an average decrease of 2 percentage points in MAPE. However, this method also has certain limitations. When dealing with extreme or complex weather conditions, prediction accuracy may be limited by the scarcity of similar daily samples. In the future, efforts can be made to optimize prediction models so that they adapt better to complex weather conditions and deliver higher prediction accuracy and reliability under them.
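
The reductions quoted in findings (2) and (3) can be reproduced from the tabulated results. The short check below uses only the RMSE and MAPE values listed in Table 3, Table 4 and Table 5; the variable names are introduced for this example.

```python
# (RMSE, MAPE) per model: Table 3 (undivided weather) and Table 4 (after clustering)
before = {"CNN-LSTM": (7.518, 0.087), "BPNN": (10.931, 0.112),
          "LSTM": (9.048, 0.091), "WNN": (9.866, 0.105)}
after = {"CNN-LSTM": (7.230, 0.062), "BPNN": (10.336, 0.088),
         "LSTM": (8.638, 0.074), "WNN": (8.976, 0.091)}

n = len(before)
avg_rmse_drop = sum(before[m][0] - after[m][0] for m in before) / n
avg_mape_drop = sum(before[m][1] - after[m][1] for m in before) / n

# Table 5: manually tuned baseline ("Base") versus CRSSA
rmse_gain = 8.859 - 7.230
mape_gain = 0.097 - 0.063

print(round(avg_rmse_drop, 3), round(avg_mape_drop, 3))  # 0.546 0.02
print(round(rmse_gain, 3), round(mape_gain, 3))          # 1.629 0.034
```

The MAPE values are stored as fractions, so differences of 0.02 and 0.034 correspond to 2 and 3.4 percentage points.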

Author Contributions

Conceptualization, X.H.; validation, L.W.; methodology, L.G.; formal analysis, L.H.; writing—original draft preparation, T.D.; writing—review and editing, Y.Z.; funding acquisition, Y.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the State Key Laboratory of Alternate Electrical Power System with Renewable Energy Sources, grant number LAPS23018, National Natural Science Foundation of China, grant number 52277118 and National Science and Technology Major Project, grant number 2022ZD0116900.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

References

Xu, X.; Wang, H.; Yan, Z.; Lu, Z.; Kang, C.; Xie, K. Overview of Power System Uncertainty and Countermeasures in the Context of Energy Transformation. Autom. Electr. Power Syst. 2021, 45, 2–13.
Chen, G.; Wu, W.; Dai, Z.; Xu, X.; Zhang, Y.; Hao, S. Multi objective optimization of regional reactive power reserve of hybrid wind solar power storage system considering fault scenario set. Autom. Electr. Power Syst. 2022, 46, 194–204.
Wen, X.; Chen, X.; Zhang, A.; Zhong, H.; Guo, L.; Huang, M.; Yan, W. Market risk management and control method of wind solar hybrid micro grid random dispatching with adjustable load. Power Syst. Technol. 2021, 45, 4308–4318.
Li, B.; Li, M.; Liu, M. Photovoltaic power generation prediction model based on path analysis and phase space reconstruction. Electr. Meas. Instrum. 2022, 59, 79–87.
Yao, H.; Du, X.; Qin, W. Photovoltaic power generation power prediction method based on density peak clustering and GRNN neural network. Acta Energiae Solaris Sin. 2020, 41, 184–190.
Zang, H.X.; Cheng, L.L.; Ding, T.; Cheung, K.W.; Liang, Z.; Wei, Z.; Sun, G. Hybrid method for short-term photovoltaic power forecasting based on deep convolutional neural network. IET Gener. Transm. Distrib. 2018, 12, 4557–4567.
Gao, M.M.; Li, J.J.; Hong, F.; Long, D. Day-ahead power forecasting in a large-scale photovoltaic plant based on weather classification using LSTM. Energy 2019, 187, 115838.
Zheng, J.; Zhang, H.; Dai, Y.; Wang, B.; Zheng, T.; Liao, Q.; Liang, Y.; Zhang, F.; Song, X. Time series prediction for output of multi-region solar power plants. Appl. Energy 2020, 257, 114001.
Eseye, A.T.; Zhang, J.H.; Zheng, D.H. Short-term photovoltaic solar power forecasting using a hybrid Wavelet-PSO-SVM model based on SCADA and Meteorological information. Renew. Energy 2018, 118, 357–367.
Wu, X.; Wang, Z.; Dai, W.; Zhao, Z.; Guo, S.; Zhang, D. Dual integrated photovoltaic power generation power prediction based on heterogeneous clustering and stacking. Grid Technol. 2023, 47, 275–284.
Wang, K.; Du, H.; Jia, R.; Liu, H.; Liang, Y.; Wang, X. Short term interval probability prediction of photovoltaic power based on similar day clustering and QR-CNN-BiLSTM model. High Volt. Technol. 2022, 48, 4372–4388.
Li, Q.; Zhang, X.; Ma, T.; Liu, D.; Wang, H.; Hu, W. A Multi-step ahead photovoltaic power forecasting model based on TimeGAN, Soft DTW-based K-medoids clustering, and a CNN-GRU hybrid neural network. Energy Rep. 2022, 8, 10346–10362.
Li, K.; Jiang, Y.; Huang, S.; Li, J.; Yang, M. Low voltage topology identification method of distribution station area based on DTW distance and cluster analysis. Power Syst. Prot. Control 2021, 49, 29–36.
Yang, L.; Zhang, Z. A Deep Attention Convolutional Recurrent Network Assisted by K-Shape Clustering and Enhanced Memory for Short Term Wind Speed Predictions. IEEE Trans. Sustain. Energy 2022, 13, 856–867.
Zhang, A.; Duan, X.; He, X. A new wind power prediction model based on CNN and LightGBM. Electr. Meas. Instrum. 2021, 58, 121–127.
Zhao, C.; Pian, R.; Du, T.; Ge, L. Power quality trend prediction analysis model for important users based on LSTM. Proc. CSU-EPSA 2022, 34, 26–33.
Chen, C.; Zhao, X.; Bi, G.; Xie, X.; Gao, J.; Luo, C. Short term wind speed prediction model based on multi-mode decomposition and sparrow optimized residual network. Power Syst. Technol. 2022, 46, 2975–2985.
Lou, A.; Yao, M.; Jia, W.; Yuan, D. Tent Chaos and GSA for Local Search Optimization in Variable Neighborhood. J. Xidian Univ. 2019, 46, 120–127.

Figure 1. DTW schematic diagram.

Figure 2. LSTM structure diagram.

Figure 3. CNN-LSTM structure diagram.

Figure 4. CRSSA-CNN-LSTM forecast flow chart.

Figure 5. Contour coefficient values for various cluster numbers.

Figure 6. Different weather patterns for cluster partitioning.

Figure 7. Comparison of different clustering methods.

Figure 8. Weather Pattern I typical-day forecast results for each model.

Figure 9. Weather Pattern II typical-day forecast results for each model.

Figure 10. Weather Pattern III typical-day forecast results for each model.

Figure 11. Convergence curves of different optimization algorithms.

Table 1. The correlation coefficient between various meteorological factors and output power.

Meteorological Factor	Pearson Coefficient	Spearman Coefficient	MIC
Wind speed	0.26	0.28	0.08
Barometric pressure	0.19	0.23	0.09
Irradiance	0.98	0.92	0.94
Relative humidity	0.37	0.39	0.15
Ambient temperature	0.41	0.44	0.14

Table 2. Comparison of distance measurement methods.

Algorithm	$\gamma_{RMSE}$	$\gamma_{MAPE}$	$\gamma_{R^{2}}$
DTW	7.230	0.062	0.983
Euclidean distance	8.463	0.076	0.955
Cosine Distance	7.894	0.071	0.968
Manhattan Distance	8.872	0.082	0.942

Table 3. Prediction accuracy of each model with undivided weather.

Model	$\gamma_{RMSE}$ (kW)	$\gamma_{MAPE}$	$\gamma_{R^{2}}$
CNN-LSTM	7.518	0.087	0.973
BPNN	10.931	0.112	0.927
LSTM	9.048	0.091	0.958
WNN	9.866	0.105	0.934

Table 4. Prediction accuracy of each model after clustering and dividing weather.

Model	$\gamma_{RMSE}$	$\gamma_{MAPE}$	$\gamma_{R^{2}}$
CNN-LSTM	7.230	0.062	0.983
BPNN	10.336	0.088	0.942
LSTM	8.638	0.074	0.975
WNN	8.976	0.091	0.953

Table 5. CNN-LSTM prediction error indicators with different optimization algorithms.

Error Indicator	Base	CRSSA	SSA	PSO	ACO
RMSE	8.859	7.230	7.461	7.813	7.544
MAPE	0.097	0.063	0.071	0.084	0.075

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
