1. Introduction
Solar photovoltaic energy (SPE) is positioned as an energy source that contributes significantly to the diversification of the world’s energy matrix. According to statistics from the International Renewable Energy Agency (IRENA), the worldwide installed capacity of photovoltaic systems (PVS) grew from 40 GW in 2010 to 714 GW in 2020, which means that approximately 94% of the 2020 capacity was added during that decade [1]. In South America, according to IRENA, the installed capacity of PVS grew from 43 MW in 2010 to 12 GW in 2020, so that approximately 99% of the region's 2020 capacity was installed over the same period [1].
This growth brings several challenges associated with high PVS penetration rates. The maximum generation limit of this type of source changes over time, on scales from seconds to years [2], which is known as variability. In addition, this limit is not known with perfect precision, which is called uncertainty or error. The Earth's movement around the sun generates variability that can be predicted, whereas the variability associated with clouds is difficult to predict, and the difficulty of forecasting weather conditions adds further uncertainty.
SPE generation forecasts are fundamental to facing the challenges brought by variability and uncertainty. They are also a valuable tool for the management of electricity grids, their security and the commercialization of solar energy [3]. Some works apply forecasts to improve the performance of the electricity system, treating demand and prices as the variables to be predicted [4,5,6]. Both generators and suppliers that work with PV systems require forecasts for making operational and planning decisions [5]. Moreover, because SPE depends strongly on weather conditions, its output is unstable and can affect the reliability and quality of the power grid, causing frequency and voltage fluctuations [7].
Considering this context, the main objective of this work is to apply, investigate and evaluate two fuzzy time series (FTS) methods for the short-term prediction of SPE generation using historical data from a single database, obtained in Florianópolis, Santa Catarina, Brazil. The database was provided by the Photovoltaic Laboratory [8] by personal e-mail communication and can be accessed at the GitHub repository [9]. As reported in [10], there are still few studies in the literature conducted in Latin American countries.
Given this main objective, the specific objectives are: (1) to compare the global horizontal irradiance (GHI) forecast accuracy of two FTS methods that learn in different ways, namely a first-order weighted method with chronological weights and a higher-order method working with fuzzy information granules (FIG); (2) to subsequently evaluate the use of FTS in a power simulation using the spatial smoothing method [11]; and (3) across the first two objectives, to compare the performance of the FTS methods at three short-term prediction horizons: 5, 15 and 30 min.
FTS methods offer flexibility because they account for natural circumstances while handling vague and imprecise knowledge in time series data [12]. Among the available tools, the FTS methods in this work are implemented with the Python library pyFTS [13], which follows the steps of the method proposed in [12] for forecasting with FTS. This library was developed by the MINDS laboratory (Machine Intelligence and Data Science laboratory), which researches computational intelligence and machine learning, optimization, data visualization and decision making [14].
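To make the general idea of forecasting with FTS concrete, the following is a minimal sketch of a conventional first-order FTS (partition the universe of discourse, fuzzify, extract rules, defuzzify). It is written from scratch for illustration and is not the pyFTS implementation, nor the weighted or FIG methods evaluated in this work. The function names, the number of fuzzy sets and the GHI values are hypothetical.

```python
from collections import defaultdict


def make_partitions(lo, hi, n_sets):
    """Centers of n_sets evenly spaced triangular fuzzy sets on [lo, hi]."""
    step = (hi - lo) / (n_sets - 1)
    return [lo + i * step for i in range(n_sets)], step


def fuzzify(value, centers, step):
    """Index of the set with maximum membership (nearest center for equal-width triangles)."""
    idx = round((value - centers[0]) / step)
    return max(0, min(len(centers) - 1, idx))


def train(series, centers, step):
    """Build first-order rules A_i -> {A_j, ...} from consecutive observations."""
    rules = defaultdict(set)
    labels = [fuzzify(v, centers, step) for v in series]
    for lhs, rhs in zip(labels, labels[1:]):
        rules[lhs].add(rhs)
    return rules


def forecast(value, rules, centers, step):
    """One-step-ahead point forecast: mean of the centers of the rule consequents."""
    lhs = fuzzify(value, centers, step)
    rhs = rules.get(lhs, {lhs})  # unseen antecedent: fall back to the set itself
    return sum(centers[j] for j in rhs) / len(rhs)


# Hypothetical GHI values in W/m^2, not taken from the database of this study.
ghi = [100, 200, 300, 400, 300, 200, 100, 200, 300, 400]
centers, step = make_partitions(0, 500, 6)  # centers at 0, 100, ..., 500
rules = train(ghi, centers, step)
print(forecast(300, rules, centers, step))  # 300.0
```

In this toy series the value 300 is followed once by 400 and once by 200, so the unweighted rule averages both consequents. A weighted method with chronological weights would instead give more importance to the most recent consequents.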
According to one of the most comprehensive literature reviews [15], significant progress has been made in PV power generation forecasting, especially in recent years with the use of machine learning and deep learning methods. However, relatively few studies in this research field apply FTS. The study described in [16] uses irradiance data recorded at 30 min intervals to generate forecasts with two FTS methods, showing optimal performance when compared to other forecasting methods. Another work developed a fuzzy logic model for short-term forecasting of solar energy production one hour ahead [17]. In [18,19], fuzzy logic is applied to short-term load forecasting. More recently, the studies in [20,21] improve the FTS method with probabilistic forecasting and the information granule method, with the aim of simplifying the process for multivariate models.
Several publications show the progress in the study of FTS. In [22], the non-stationary fuzzy time series (NSFTS) is introduced, which is able to dynamically adapt its fuzzy sets to reflect changes in the underlying stochastic processes based on residual errors. A short-term forecasting method based on the Takagi-Sugeno (T-S) fuzzy model has been proposed for wind power and wind speed, with results showing that the T-S fuzzy model can effectively improve the accuracy of short-term wind power forecasting [23]. In [21], it is shown how to generate interval forecasts, from which a cumulative distribution function can be constructed and used to build the quantile function and probabilistic forecasts through stochastic simulations or ensembles. Interval forecasts address the drawbacks of point forecasts, which is also discussed in [24], where the probability distribution problem is addressed using kernel density estimation. Stacked modules of a deep fuzzy model (DIRM-DFM) have been proposed for accurate prediction, illustrating the current progress of fuzzy models [25]. Finally, [26] presents a hyperparameter optimization method for high-order weighted FTS that uses genetic algorithms to automate the generation of accurate and parsimonious models.
The FTS method has also been coupled with other forecasting methods, with interesting results. In [27], FTS and convolutional neural networks (CNN) are combined for short-term load forecasting, and good efficiency is reported. Similar results are reported in [28], where a solar forecasting model based on an artificial neural network (ANN) is improved with fuzzy logic preprocessing. The study reported in [18] highlights the ability of FTS methods to deal with sudden variations in temperature, considering their influence on PVS, in a simple and robust way.
In this context, this study evaluates the performance of the FTS methods combined with the spatial smoothing method to solve the SPE generation forecasting problem, considering the spatial dimension and characteristics of a specific photovoltaic system. For this evaluation we follow the compilation made in [29], which recommends three criteria to measure the accuracy of a model: the overall bias, the dispersion and the ability to reproduce statistical distributions. The most recommended metrics to quantify these criteria are, respectively, the mean bias error (MBE), the root mean square error (RMSE) and the Kolmogorov-Smirnov (KS) test. In addition, the coefficient of determination (R²), which expresses the fit of the predicted data to the original data, is used [15].
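The four metrics named above can be sketched as follows. This is an illustrative implementation using common textbook definitions, and the exact formulations used in [29] and [15] may differ, in particular for the KS-based indicator. The sign convention for the MBE (predicted minus observed, so that negative values indicate underestimation) is assumed here, and the sample values are hypothetical.

```python
import math
from bisect import bisect_right


def mbe(obs, pred):
    """Mean bias error; negative values indicate underestimation."""
    return sum(p - o for o, p in zip(obs, pred)) / len(obs)


def rmse(obs, pred):
    """Root mean square error (dispersion of the forecast errors)."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))


def r2(obs, pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def ks_statistic(obs, pred):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    so, sp = sorted(obs), sorted(pred)
    d = 0.0
    for x in sorted(set(so) | set(sp)):
        f_obs = bisect_right(so, x) / len(so)
        f_pred = bisect_right(sp, x) / len(sp)
        d = max(d, abs(f_obs - f_pred))
    return d


# Hypothetical observed and predicted GHI values in W/m^2.
obs = [100, 200, 300, 400]
pred = [90, 210, 280, 400]
print(mbe(obs, pred), rmse(obs, pred), r2(obs, pred), ks_statistic(obs, pred))
```

For these sample values the MBE is -5.0 W/m^2, so the forecast underestimates on average, and the KS statistic is 0.25.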
This work is organized as follows. Section 2 presents the theoretical foundation, detailing the forecasting approaches and the FTS method, including how the fuzzy relationships are created and how the forecasts are generated from them. Section 3 describes the model implementation process, the database used, the hyperparameter configuration and the learning process of each method. Section 4 shows the results obtained from the irradiance predictions and the subsequent power simulation, together with a discussion of these results. Finally, the conclusions are presented in Section 5.
5. Conclusions
This study developed two multivariable fuzzy time series (FTS) methods and evaluated their use in the indirect prediction of short-term photovoltaic power generation. In addition, combining the FTS methodology with spatial smoothing for power simulation (direct prediction) allows for a controlled experimental setup in which the whole process can be monitored, in particular how each model learns and how the fuzzy rules are created, as shown in Figure 4.
Both methods, WEIGHTED-FTS and FIG-FTS, used the GHI as the endogenous variable and the minute, hour, day, date and ambient temperature as exogenous variables, and were evaluated at forecast horizons of 5, 15 and 30 min. Although FTS methods have been applied in other works, a contribution of this study is the use of the spatial smoothing method, applied as a low-pass filter in a post-processing step, to simulate power values that take the particular characteristics of the PVS into account. For analysis purposes, data from the 2.2 kWp PVS of the solarimetric station installed in the Photovoltaic Laboratory of the Federal University of Santa Catarina (UFSC), located in the city of Florianópolis, Brazil, were used.
In the comparative analysis of both methods, the higher-order FIG-FTS method provided better results in the GHI forecast at the 5 and 15 min horizons, as can be seen in the statistical results of Table 7 and in the analysis of the KS statistic, which represents the model’s capacity to reproduce the cumulative distribution function of the observed data. However, the GHI values predicted with the FIG method at the 30 min horizon (the longest of the horizons tested) are underestimated, i.e., MBE < 0. At the 30 min horizon, the best statistics are obtained with the WEIGHTED-FTS method, as can be seen in Figure 6. This suggests that first-order methods are more effective for longer prediction horizons.
Once the power simulation was performed with the low-pass filter, considering the PVS specifications, coupling with the FIG method resulted in better statistical indexes. Table 8 also shows that shorter time horizons, such as 5 min, improve statistics such as the RMSE and KS test values. For the data obtained in the power simulation, the statistical analysis shows higher error values, as well as an inability to reproduce the statistical distribution of the samples. This is because the low-pass filter, by smoothing the fast irradiance peaks, also eliminates the large variations in irradiance that are typical of this type of measurement.
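The smoothing effect described above can be illustrated with a generic first-order discrete low-pass filter. This is only a sketch: the actual filter of the spatial smoothing method [11] and its dependence on the PVS characteristics are not reproduced here, and the cutoff frequency, sampling interval and irradiance values below are hypothetical.

```python
import math


def low_pass(series, dt_s, cutoff_hz):
    """First-order (RC-type) discrete low-pass filter.

    dt_s is the sampling interval in seconds and cutoff_hz the cutoff
    frequency; both are assumed values chosen for illustration.
    """
    tau = 1.0 / (2.0 * math.pi * cutoff_hz)  # filter time constant
    alpha = dt_s / (tau + dt_s)              # smoothing factor in (0, 1)
    out = [series[0]]
    for x in series[1:]:
        out.append(out[-1] + alpha * (x - out[-1]))
    return out


# Hypothetical 1 s irradiance samples (W/m^2) with a short, fast peak.
ghi = [600.0] * 5 + [1000.0] * 2 + [600.0] * 5
smooth = low_pass(ghi, dt_s=1.0, cutoff_hz=0.05)
print(max(ghi), max(smooth))
```

The filtered peak is well below the measured peak of 1000 W/m^2, which shows how the filter removes fast variations from the series and therefore alters its statistical distribution.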
Although the higher-order method gives better results, increasing the order of the method indiscriminately does not imply an increase in performance. The higher the order, the more fuzzy sets are generated, and too many fuzzy sets lead to overfitting, with the model starting to learn the noise in the data; conversely, low orders lead to underfitting due to oversimplification of the signal [45].
The results suggest that both FTS methods can be used for PV energy generation forecasting at the short-term horizons evaluated, with the best accuracy depending on the prediction horizon. In addition, the direct prediction produced higher errors than the indirect prediction for both FTS methods analyzed. The implementation of FTS through the Python library pyFTS is considered a reliable process, since its source code is accessible. In this work, point forecasting is used, but performance could be improved by including, in addition to hyperparameter optimization, interval prediction, with which the fuzzy method can remain heuristically simple and fast without the large computational demand of estimating a probability density function [24]. Furthermore, good performance of the FTS forecasting process requires a database with good integrity. The database provided by the Photovoltaic Laboratory of the Federal University of Santa Catarina was of vital importance.
Finally, another original contribution of this study is that the prediction results cover two approaches simultaneously, direct and indirect prediction, a combination that has not been observed in the solar photovoltaic energy forecasting literature [15,30,47]. In addition, this study allowed the evaluation of two FTS methods with different experimental setups and of their performance when used together with spatial smoothing applied through the low-pass filter. Future work will compare the accuracy of FTS methods with benchmark models and other machine learning methods, and will address forecasting with FTS hyperparameter optimization [26] and probabilistic forecasts that consider the seasonal components of the input variables [21].