Hourly Photovoltaic Production Prediction Using Numerical Weather Data and Neural Networks for Solar Energy Decision Support

Nicoletti, Francesco; Bevilacqua, Piero

doi:10.3390/en17020466


by Francesco Nicoletti ¹,²,* and Piero Bevilacqua

¹ Department of Mechanical, Energy and Management Engineering (DIMEG), University of Calabria, Via P. Bucci, 87036 Rende, Italy

² Department of Electrical Electronic and Computer Engineering (DIEEI), University of Catania, Viale A. Doria 6, 95125 Catania, Italy

* Author to whom correspondence should be addressed.

Energies 2024, 17(2), 466; https://doi.org/10.3390/en17020466

Submission received: 13 December 2023 / Revised: 13 January 2024 / Accepted: 16 January 2024 / Published: 18 January 2024

(This article belongs to the Special Issue Decisions and Market Analysis for Solar Energy)


Abstract

The day-ahead photovoltaic electricity forecast is increasingly necessary for grid operators and for energy communities. In the present work, the hourly PV production is estimated using two models based on feedforward neural networks (FFNNs). Most existing models use solar radiation as an input. Instead, the models proposed here use numerical weather prediction (NWP) data: ambient temperature, relative humidity, and wind speed, which are easily accessible to anyone. The first proposed model uses multiple inputs, while the second one uses only the necessary information. A sensitivity analysis allows for the identification of the variables that are most influential on the estimation accuracy. This study concludes that the hourly temperature trend is the most important variable for prediction. The models’ accuracy was tested using experimental and NWP data, with the second model having almost the same accuracy as the first despite using fewer input data. The results obtained using experimental data as inputs show a coefficient of determination (R²) of 0.95 for the hourly PV energy produced. The RMSE is about 6.4% of the panel peak power. When NWP data are used as inputs, R² is 0.879 and the RMSE is 10.5%. These models can have a significant impact by enabling individual energy communities to make their forecasts, resulting in energy savings and increased self-consumed energy.

Keywords: PV forecast; artificial neural network; photovoltaic; weather forecast

1. Introduction

Today, photovoltaics stands as one of the most crucial technologies for achieving green energy production goals and advancing towards a sustainable future. However, the inherent variability of solar energy poses a significant challenge when it comes to planning energy consumption. Predicting energy production from photovoltaics for the following day is a critical task that offers two substantial advantages in effective energy management:

- Consumers and energy communities can optimize their electricity usage based on forecasted energy availability. For instance, a consumer can strategically time the operation of appliances to reduce reliance on the grid, taking into account dynamic energy prices [1,2]. Various control systems are instrumental in scheduling electrical loads to ensure that they stay within the installed power capacity, and this coordination is significantly enhanced when integrated with a production forecasting system.
- Photovoltaic electricity production forecasting aids grid operators in planning energy distribution. The erratic nature of energy generated from renewable sources poses a challenge to maintaining grid frequency stability. Having prior knowledge of these fluctuations is increasingly crucial, especially as renewables are expected to contribute a larger share of the energy supply in the near future.

The prediction of energy generation can be categorized into short-term and long-term forecasts, with the prediction for the next day falling into the former category. These forecasting models are grouped based on the methodologies that they employ, with the most prominent categories being statistical, physical, and artificial intelligence (AI) methods.

The statistical approach involves seeking mathematical formulations that establish connections between input variables and electricity production. A widely used method within this category is the autoregressive moving average (ARMA) [3]. Wang et al. [4], on the other hand, used partial functional linear regression models. Although other techniques have evolved from ARMA [5,6], these approaches often provide less reliable results when addressing sudden changes in solar radiation. The inherent rapid fluctuations are not adequately captured by statistical methods.

Physical models, on the other hand, are based on equations that enable thermal and electrical modeling of photovoltaic panels. Various studies have proposed models capable of predicting both PV energy production and panel temperature [7]. These models are often built upon energy balance methods [8,9], which may utilize one- [10] or two-dimensional approaches [11]. Additionally, computational fluid dynamics (CFD) models have been introduced [12]. However, these methods require real-time access to climatic variables to accurately assess the thermoelectric behavior of PV panels, and such precise data are often unavailable for forecasting purposes, particularly for the following day.

The objective of this study was to employ an artificial intelligence approach to predict photovoltaic production. Artificial intelligence has assumed a pivotal role as a predictive tool in various applications, with a significant focus on solar energy production. Machine learning techniques, particularly diverse neural networks, have garnered extensive attention in the recent literature for forecasting PV production [13,14,15]. For instance, Pedro et al. [16] conducted a comparative analysis of several forecasting methods to evaluate their accuracy in predicting solar power output from a 1 MWp, single-axis tracking photovoltaic power plant in California. Their findings concluded that artificial neural network (ANN) models outperformed other methods.

In the context of monthly solar power output forecasting, a method employing seasonal decomposition and least-squares support-vector regression has been proposed [17]. An ANN has been integrated with data processing, input variable selection, and external optimization techniques to forecast PV system power output [18]. Furthermore, an ANN has been indirectly utilized for PV power prediction through solar irradiance forecasting [19]. A multilayer perceptron (MLP) model was suggested to forecast 24 h solar irradiance based on daily solar irradiance and air temperature data from an experimental database. The study included a practical application comparing the actual power output from a rooftop PV plant in the Municipality of Trieste with the power calculated using 24-h-ahead solar irradiance forecasts.

In the Republic of Korea, an ANN was employed to model urban energy supply plants and renewable energy availability, integrating energy-related legal regulations, standards, and energy plant facilities into an energy geographic information system database [20]. Forecasting power generation 24 h in advance using a radial basis function network (RBFN) was proposed in [21]. This technique directly forecasts PV systems’ power output using historical records and real-time meteorological data. A recurrent neural network was introduced to predict PV power in a peak zone without relying on future meteorological forecasts, solely using PV power outputs and morning meteorological observations [22]. In another study, a seven-parameter electrical model and a feedforward neural network were cascaded to test multicrystalline PV panels’ performance, achieving mean bias error deviations of less than ±1% [23]. A recurrent neural network model with long short-term memory was developed to recognize temporal patterns in data collected from 164 PV sites over 63 months, including weather conditions and estimated solar irradiation. The model achieved a normalized root-mean-square error of 7.416% and a mean absolute percentage error of 10.8% [24]. Almonacid et al. [25] proposed a methodology for forecasting PV output one hour ahead using a dynamic artificial neural network. This approach employed two ANNs for predicting weather variables (solar irradiance and air temperature) and a third ANN to estimate the output power of a PV module. A fourth ANN incorporated the output of the preceding ANN and the PV configuration to provide final forecast values. Additionally, a combination of a linear regression model and an ANN was utilized to predict the performance of soiled PV modules using solar irradiation and ambient temperature [26]. Notably, an ANN has also been applied to suggest an active cooling algorithm based on fan cooling for the back surface of PV panels [27]. 
At the University of Malaya, an extreme learning machine (ELM) algorithm was developed to forecast the maximum power point tracking (MPPT) of three grid-connected PV plants, considering forecasting horizons of 1 h and 1 day ahead [28].

The majority of the models mentioned in the previous discussion utilized solar radiation as their primary input data and achieved highly satisfactory results. However, in practice, obtaining accurate solar radiation information as an input in advance using forecast models can be challenging. On the other hand, numerous studies focus on predicting solar radiation and subsequently employ a physical model to estimate photovoltaic output. Undoubtedly, solar irradiance plays a pivotal role in determining the electrical performance of PV panels. Nevertheless, the electricity production is not solely defined by solar irradiance. The conversion efficiency also relies on the cell temperature, which, in turn, is influenced by various boundary conditions. A comprehensive analysis necessitates that the neural network directly provides the electrical output, allowing it to factor in the panel’s conversion efficiency.

In this study, two models are proposed, both based on artificial neural network (ANN) technology, with the goal of predicting the power output of a silicon photovoltaic module. In a departure from the existing literature, these models predict the PV module’s performance without relying on solar radiation data as inputs. The aim is to perform the forecast using the numerical weather prediction (NWP) data that are easily accessible through websites. Specifically, this relies on hourly temperature, relative humidity, and wind speed data. These models have the potential to empower individual energy communities to create their own forecasts. This approach holds significant promise, as it only requires standard meteorological data for each location. Furthermore, this work offers insights into determining the optimal number of neurons for the neural network architecture and ranks the most critical information for short-term forecasting. This information can serve as a foundational reference for future research endeavors aimed at exploring this problem further.

The remainder of this paper is structured as follows: Section 2 introduces the applied methodology, presenting two distinct models, both starting from the hourly values of three selected quantities. The first model incorporates numerous additional inputs derived from daily processing of these hourly values, while the second model relies solely on the essential information. In Section 3, the study’s outcomes are revealed. This section outlines the defined network architectures and presents the results of tests conducted using both experimental data and NWP data as inputs. Additionally, it includes a sensitivity analysis aimed at identifying the most critical variables in the models.

2. Materials and Methods

2.1. The Proposed ANN Models

The proposed ANN models were developed with the objective of predicting the hourly electricity production of a photovoltaic panel. Figure 1 provides a summary diagram of the investigation. The selected inputs should be readily obtainable from numerical weather forecasts; thus, the models rely on three crucial variables: hourly air temperature, relative humidity, and wind speed. The initial step involves configuring how these inputs are fed into the ANN. Data preprocessing is performed on a daily scale to generate additional input variables. Two models are introduced here for this purpose. The first model, referred to as Model1, incorporates all of the selected inputs, encompassing both hourly and daily data. Training and validation were conducted using experimental data collected in the laboratory of the Department of Mechanical, Energy and Management Engineering. These phases aimed to identify the optimal neural network architecture and assess the quality of the prediction results. The other model, designated as Model2, explores the use of a reduced number of inputs. Model2 is derived from Model1 by systematically excluding one input variable at a time, allowing for an evaluation of the importance of each input variable. As with Model1, training and validation were carried out to determine the most effective network architecture. Subsequently, both networks were tested using experimental data collected during different seasons, including summer, spring, and winter. To assess the networks' stability and their ability to cope with potential errors associated with each input variable, a sensitivity analysis was conducted. Finally, a comprehensive evaluation was undertaken by using NWP data as inputs for the models.

2.2. Artificial Neural Network

The artificial neuron is the basic element of a neural network. It functions in a similar way to a biological neuron, which generates an electrical impulse that propagates along the axon (i.e., the output of the neuron) only if the electrical potential of the neuron exceeds a certain threshold. Similarly, the artificial neuron analyses the intensity of each input, comparing it with a reference value (bias), and provides the output using an activation function. The data are then multiplied by a weight and reach another neuron as inputs. In mathematical terms, the output $o_{k}$ of neuron $k$ can be modeled with the following expression:

$$o_{k} = \varphi\left(b_{k} + \sum_{j = 1}^{n} w_{k j} x_{j}\right) \qquad (1)$$

where $\varphi$ represents the activation function, $b_{k}$ is the bias of the neuron, $n$ is the number of inputs to the neuron, $x_{j}$ is the $j$-th input, and $w_{k j}$ is the weight assigned to each input through the synaptic connections. In this study, the activation function used is the sigmoid function:

$$\varphi(z) = \frac{1}{1 + e^{-z}} \qquad (2)$$

This activation function is often used in typical ANN applications and allows nonlinearity to be introduced into the overall input–output link. The architecture of a feedforward neural network (FFNN) consists of numerous neurons arranged in an input layer, one or more hidden layers, and an output layer. Each neuron of a layer is connected to all of the neurons of the next layer. The designer of a network has the task of identifying the numbers of hidden layers and neurons. Most scientific articles suggest a trial-and-error approach, and some articles suggest starting values for the various attempts. A single hidden layer makes it possible to solve most problems with high accuracy. For the number of hidden neurons to be used, Boger et al. [29] suggest starting with 70–90% of the number of input neurons.
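As a minimal sketch of Equations (1) and (2), the following Python snippet evaluates a single neuron and a fully connected layer. The function names, the example weights, and the rounding used for the 70–90% starting range are illustrative assumptions and are not taken from the paper, whose networks were built in MATLAB.

```python
import math


def sigmoid(z):
    """Sigmoid activation, Eq. (2)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(inputs, weights, bias):
    """Output of one neuron, Eq. (1): phi(b_k + sum_j w_kj * x_j)."""
    z = bias + sum(w * x for w, x in zip(weights, inputs))
    return sigmoid(z)


def layer_output(inputs, weight_rows, biases):
    """Fully connected layer: each row of weights feeds one neuron."""
    return [neuron_output(inputs, w, b) for w, b in zip(weight_rows, biases)]


def hidden_neuron_start_range(n_inputs):
    """Starting range of 70-90% of the input count (rounding is an assumption)."""
    return round(0.7 * n_inputs), round(0.9 * n_inputs)


# Illustrative numbers only, not values from the paper.
x = [0.5, -1.0, 2.0]
hidden = layer_output(x, [[0.1, 0.2, 0.3], [0.0, 0.0, 0.0]], [0.0, 0.0])
print(hidden)
print(hidden_neuron_start_range(10))
```

With all weights and the bias set to zero, the neuron returns the sigmoid midpoint of 0.5, which is a quick sanity check on the implementation.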

2.3. Selection of Input Data

The input variables are temperature, relative humidity, and wind speed, which weather websites provide on an hourly basis for the following day. To forecast the day ahead, the neural network is thus supplied with 24 values for each variable.

These input variables are able to provide valuable insights for predicting the electricity production. For instance, relative humidity tends to be higher on rainy or overcast days, while air temperature tends to be higher on sunny days. Wind speed plays a role in the heat transfer of the photovoltaic panel and affects its efficiency. Furthermore, it acts as an indicator that encapsulates information about the variations in atmospheric conditions due to pressure gradients, which can swiftly alter sky cover. All of these aspects are indicative of solar irradiance and also impact the cell temperature, a parameter with a direct influence on the conversion efficiency of the photovoltaic module. Therefore, these inputs all have a direct or indirect influence on electrical predictability. Additional essential inputs include the same data processed on a daily basis. For instance, information regarding the minimum and maximum daily temperatures provides the ANN with a fixed reference point for hourly temperature values. Moreover, the largest daily temperature range typically signifies a clear day. The minimum and maximum relative humidity values are also critical; when the minimum relative humidity is close to 100%, it often indicates a high likelihood of rainy, overcast, or foggy conditions throughout the day. All of the selected input data that could influence PV electricity production are outlined in Table 1. The initial eight variables represent daily data and aid in categorizing the overall type of day, whether it is clear, cloudy, partly cloudy, etc. The remaining four variables are measured hourly and assist in understanding how electricity production is distributed over time. All of these data, in conjunction with the electrical output of a photovoltaic panel, are collected using an experimental setup and employed for training the neural network.
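The daily preprocessing described above can be sketched as follows. This is a hedged illustration: the function name, the dictionary keys, and the sample profiles are assumptions, and the eight daily quantities follow the list of daily inputs given for the first network in Section 2.4.

```python
from statistics import mean


def daily_features(day_of_year, temp, rel_hum, wind):
    """Derive the eight daily inputs from 24 hourly values per variable."""
    if not (len(temp) == len(rel_hum) == len(wind) == 24):
        raise ValueError("expected 24 hourly values for each variable")
    return {
        "DoY": day_of_year,
        "T_min": min(temp),
        "T_max": max(temp),
        "T_avg": mean(temp),
        "RH_min": min(rel_hum),
        "RH_max": max(rel_hum),
        "RH_avg": mean(rel_hum),
        "ws_avg": mean(wind),
    }


# Made-up hourly profiles, for illustration only.
temp = [10.0 + h * 0.5 for h in range(24)]        # degrees Celsius
rel_hum = [90.0 - h * 1.0 for h in range(24)]     # percent
wind = [2.0] * 24                                 # m/s
features = daily_features(75, temp, rel_hum, wind)
print(features)
```

The difference `T_max - T_min` is not stored separately here because a network that receives both values can form it internally.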

2.4. Model1

The first model, referred to as Model1, incorporates all of the selected inputs. The structure of Model1 is depicted in Figure 2 and is composed of two neural networks (ANN1 and ANN2). ANN1 processes daily data, while ANN2 deals with hourly data. The final output is the hourly PV energy produced ($HPE$), which is computed using the ANN2 network. This network also utilizes the daily PV energy production ($DPE$) as input information. This is particularly useful in estimating hourly energy production, as ANN2 can distribute the daily energy across timeslots with the aid of hourly temperature, relative humidity, and wind speed profiles. The $DPE$ is estimated by ANN1, which exclusively operates with daily data. Specifically, ANN1 relies on the following data: $DoY$, $RH_{min}$, $RH_{max}$, $T_{min}$, $T_{max}$, $T_{avg}$, $RH_{avg}$, and $ws_{avg}$. On the other hand, ANN2, in addition to using $DPE$ and the hourly data, also includes the minimum and maximum temperature and the minimum and maximum relative humidity. Finally, $HPE$ represents the energy production for the current hour. To provide ANN2 with additional information, data from not only the current timestep but also the preceding and subsequent timesteps are included as inputs. This additional information helps the network account for potential fluctuations in production relative to neighboring timesteps.
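One way to assemble an ANN2 input vector with the neighboring timesteps is sketched below. The ordering of the inputs, the function name, and the handling of the first and last hour of the day (the missing neighbor is replaced by the current hour) are assumptions; the paper does not state how the day boundaries are treated.

```python
def ann2_inputs(hour, temp, rel_hum, wind, daily, dpe):
    """Build one ANN2 input vector for a given hour of day (0-23)."""

    def window(series):
        # Previous, current, and next hour. At the edges of the day the
        # missing neighbor is clamped to the current hour (an assumption).
        prev_h = max(hour - 1, 0)
        next_h = min(hour + 1, len(series) - 1)
        return [series[prev_h], series[hour], series[next_h]]

    daily_part = [
        hour,                # HoD
        dpe,                 # daily PV energy estimated by ANN1
        daily["T_min"],
        daily["T_max"],
        daily["RH_min"],
        daily["RH_max"],
    ]
    return daily_part + window(temp) + window(rel_hum) + window(wind)


# Made-up profiles, for illustration only.
temp = list(range(24))
rel_hum = [80.0] * 24
wind = [3.0] * 24
daily = {"T_min": 0, "T_max": 23, "RH_min": 80.0, "RH_max": 80.0}
vector = ann2_inputs(12, temp, rel_hum, wind, daily, dpe=1200.0)
print(len(vector), vector)
```

Under these assumptions each hourly sample carries 15 values: six daily or contextual inputs and three values for each of the three hourly variables.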

2.5. ANN Training Procedure

The artificial neural networks were trained to learn the connection between atmospheric variables and the hourly electrical energy production. To achieve this, the networks were trained using a substantial volume of experimental data collected at the University of Calabria (Latitude: 39°21′ N; Longitude: 16°13′ E). The experimental setup was positioned on the rooftop of a building within the Department of Mechanical, Energy and Management Engineering at the University of Calabria. This setup consisted of a south-oriented PV module affixed to a metallic structure, inclined at 30°. The PV module was constructed with polycrystalline silicon cells, measuring 1663 mm × 998 mm, with a total area of 1.46 m². It had a nominal efficiency of 14.5%, a nominal power output of 245 W, and a NOCT of 43 °C. The DC/AC conversion was performed by a micro-inverter equipped with a maximum power point tracker. Meteorological and climatic conditions were monitored by an integrated weather station, with the sensor specifications provided in Table 2.

The data were collected over the years 2017, 2018, and 2019, with a one-minute timestep. However, since such a level of granularity is not necessary, the data were processed by computing hourly and daily averages. Data recorded in August, October, and December 2018 served as the validation set, while data from January, April, and July 2019 constituted the test set. The remaining data were employed for training. In total, the model had access to 867 days of data for learning the potential relationships between the hourly output and the input variables. A subset of 93 days was allocated for validation, and another 92 days was designated for testing. The training was conducted using MATLAB R2019a software, which randomly further divided the 867 days into training, validation, and test datasets, with percentages of 70%, 15%, and 15%, respectively. The chosen learning method was supervised backpropagation. This method involves adjusting the neural network's weights in a backward manner, guided by the difference between the obtained value and the desired value. The goal is to progressively reduce the root-mean-square error (RMSE) calculated using the training dataset. Training halts when the RMSE, computed with MATLAB's validation dataset, ceases to decrease for six consecutive epochs, a validation-based stopping criterion (early stopping). This approach helps mitigate the risk of overfitting. Additionally, apart from the RMSE, other statistical indices, such as normalized errors (NRMSE, NMAE, and NMBE), are monitored. These errors are defined by the following equations, where $x_{f}$ and $x_{o}$ represent the forecasted and observed values, respectively:

$$RMSE = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} {(x_{f, i} - x_{o, i})}^{2}} \qquad (3)$$

$$NRMSE = \frac{RMSE}{x_{o, max}} \qquad (4)$$

$$NMAE = \left(\frac{1}{n} \sum_{i = 1}^{n} |x_{f, i} - x_{o, i}|\right) / x_{o, max} \qquad (5)$$

$$NMBE = \left(\frac{1}{n} \sum_{i = 1}^{n} (x_{f, i} - x_{o, i})\right) / x_{o, max} \qquad (6)$$
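The four error indices of Equations (3)–(6) can be computed directly from paired forecast and observed values. The sketch below uses plain Python; the function names and the sample numbers are illustrative assumptions.

```python
import math


def rmse(forecast, observed):
    """Root-mean-square error, Eq. (3)."""
    n = len(observed)
    return math.sqrt(sum((f - o) ** 2 for f, o in zip(forecast, observed)) / n)


def nrmse(forecast, observed):
    """RMSE normalized by the maximum observed value, Eq. (4)."""
    return rmse(forecast, observed) / max(observed)


def nmae(forecast, observed):
    """Normalized mean absolute error, Eq. (5)."""
    n = len(observed)
    return sum(abs(f - o) for f, o in zip(forecast, observed)) / n / max(observed)


def nmbe(forecast, observed):
    """Normalized mean bias error, Eq. (6); negative means underestimation."""
    n = len(observed)
    return sum(f - o for f, o in zip(forecast, observed)) / n / max(observed)


# Made-up hourly energies in Wh, for illustration only.
observed = [0.0, 50.0, 120.0, 200.0]
forecast = [0.0, 40.0, 130.0, 180.0]
print(rmse(forecast, observed), nrmse(forecast, observed))
print(nmae(forecast, observed), nmbe(forecast, observed))
```

Because the sign of each error is kept in NMBE, positive and negative deviations can cancel, so a small NMBE indicates low bias rather than low error.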

2.6. Reducing the Number of Variables to Define Model2

Model1 employs numerous input variables, and some of them carry redundant information, while others are not particularly valuable for forecasting. Consequently, a new model is introduced here, which operates exclusively with the essential variables. To discern which variables have the greatest influence on prediction, Model1 was retrained iteratively by excluding one input at a time. The input data were then ranked based on their impact on the accuracy of prediction, as determined by the RMSE calculated using the validation dataset. In addition to assessing the importance of variables for prediction accuracy, the Pearson correlation coefficient was employed to examine the interrelationships between the variables. This coefficient was calculated using the following formula:

$$r_{x y} = \frac{\sum (x - x_{m}) \cdot (y - y_{m})}{\sqrt{\sum {(x - x_{m})}^{2}} \cdot \sqrt{\sum {(y - y_{m})}^{2}}} \qquad (7)$$

in which $x_{m} = \frac{1}{N} \sum_{i = 1}^{N} x_{i}$ is the average of $x$ and $y_{m} = \frac{1}{N} \sum_{i = 1}^{N} y_{i}$ is the average of $y$. The coefficient ranges from −1 to 1. If it is equal to zero, it suggests that the data are not linearly correlated. A coefficient greater than zero indicates a positive linear correlation, while a negative coefficient signifies an inverse correlation. This coefficient is calculated to assess the relationships between all possible input pairs, as well as between the inputs and the output variable. The importance ranking and Pearson's coefficient aid in the selection of the essential variables needed to create Model2. As with Model1, a thorough investigation was conducted to determine the most effective architecture for the neural network in Model2.
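Equation (7) translates into a few lines of Python. The sketch below is illustrative only; the function name and the sample series are assumptions, and the function does not guard against constant inputs, for which the denominator is zero.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient, Eq. (7)."""
    n = len(x)
    x_m = sum(x) / n
    y_m = sum(y) / n
    numerator = sum((a - x_m) * (b - y_m) for a, b in zip(x, y))
    denominator = math.sqrt(sum((a - x_m) ** 2 for a in x)) * math.sqrt(
        sum((b - y_m) ** 2 for b in y)
    )
    return numerator / denominator


# Made-up series, for illustration only.
temperature = [12.0, 15.0, 19.0, 22.0, 18.0]
humidity = [90.0, 80.0, 62.0, 55.0, 70.0]
print(pearson(temperature, humidity))
```

A perfectly increasing linear relation gives a value of 1 and a perfectly decreasing one gives −1, which bounds the values reported in the correlation analysis.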

2.7. Tests and Sensitivity Analysis

Regarding both of the defined models, two tests were conducted to evaluate their real-world accuracy. The first test employed experimental data as inputs, gathered over 92 days (specifically, in January, April, and July 2019). Quantitative accuracy assessments were carried out using error indices such as NRMSE, NMBE, and NMAE. Additionally, a sensitivity analysis was carried out to explore the impact on performance resulting from errors in input variables. This analysis is critical because neural networks must be capable of functioning with NWP data as inputs, which can be subject to errors. The sensitivity analysis was performed using the same dataset, and it introduced perturbations in the hourly trends of the input variables to monitor the corresponding increase in RMSE calculated between the network’s output and the target. Three types of errors were considered:

(1) A Gaussian error on hourly values, where the variation from the actual value is defined randomly, following a probability density defined by the standard Gaussian curve.
(2) An offset error that uniformly increases all hourly values by the same amount.
(3) An offset error that uniformly reduces all hourly values by the same amount.

The systematic errors introduced in points 2 and 3 are based on the standard deviations of the input variables. Values obtained with the introduction of these errors are subsequently processed to correct situations that cannot occur, such as relative humidity exceeding 100% or wind speed dropping below zero.
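The three perturbation types and the subsequent correction of impossible values can be sketched as follows. The function names, the use of the population standard deviation as the error magnitude for all three types, and the sample humidity profile are assumptions made for illustration; the paper states only that the systematic errors are based on the standard deviations of the input variables.

```python
import random
from statistics import pstdev


def add_gaussian_error(values, sigma, rng):
    """Type 1: independent zero-mean Gaussian error on each hourly value."""
    return [v + rng.gauss(0.0, sigma) for v in values]


def add_offset(values, delta):
    """Types 2 and 3: shift all hourly values by the same amount."""
    return [v + delta for v in values]


def clip(values, low=None, high=None):
    """Correct values that cannot occur physically."""
    corrected = []
    for v in values:
        if low is not None:
            v = max(v, low)
        if high is not None:
            v = min(v, high)
        corrected.append(v)
    return corrected


# Made-up hourly relative humidity values (%), for illustration only.
rh = [95.0, 90.0, 70.0, 55.0, 60.0, 85.0]
sigma = pstdev(rh)            # assumed error magnitude
rng = random.Random(42)       # fixed seed so the run is repeatable

rh_noisy = clip(add_gaussian_error(rh, sigma, rng), 0.0, 100.0)
rh_up = clip(add_offset(rh, +sigma), 0.0, 100.0)
rh_down = clip(add_offset(rh, -sigma), 0.0, 100.0)
print(rh_noisy, rh_up, rh_down, sep="\n")
```

For wind speed only the lower bound applies, for example `clip(perturbed_wind, low=0.0)`.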

The final test involved the use of NWP data for the next day as inputs. Weather forecasts were obtained from websites such as weather.com [30] and ilmeteo.it [31]. These forecasts were acquired between 5 March and 18 March 2022. The results of this test can also be influenced by errors in the weather forecast models. Figure 3 illustrates the predicted temperature and relative humidity obtained from both websites. The temperature forecast from weather.com exhibits a smaller daily temperature range compared to the actual data. On the other hand, the temperature forecast from ilmeteo.it mirrors the actual temperature range but often underestimates the actual temperature, particularly during nighttime. Similar observations can be made concerning relative humidity, where the actual data exhibit greater variation than what is provided by the forecast websites. Notably, values of 100% are recorded in the actual data, while such values are seldom seen in the forecasts.

3. Results

3.1. Model1 Architecture

The initial step in designing a neural network involves selecting the optimal architecture, including determining the numbers of hidden layers and neurons. It is important to note that, in neural networks, an increase in the number of nodes does not necessarily guarantee improved results (unlike situations where mesh densification is employed in fields such as mechanics and fluid dynamics). The only approach is to experiment with different configurations. To determine the best architecture for the networks, numerous trials were conducted by varying the number of nodes per layer from 3 to 35 and the number of hidden layers from 1 to 2. Each network underwent five separate training sessions with different sets of random initial weights. Out of these five training sessions, only the network that yielded the lowest RMSE on the validation dataset was retained. Table 3 displays the results for the ANN1 architectures, with statistical indices normalized against the maximum value $x_{o, max}$ of 1713 Wh from the validation dataset. The choice of the most suitable configuration is based on the lowest NRMSE, which corresponds to the network with one hidden layer and only three neurons. This same network also demonstrates strong performance when compared to the others in terms of all of the statistical indices. The negative NMBE suggests a slight underestimation in the results. It is noteworthy that increasing the number of neurons tends to diminish the prediction performance. Table 4 presents the results regarding the selection of the architecture for ANN2. In this case, the maximum hourly value $x_{o, max}$ is 241 Wh. The optimal network features one hidden layer and ten nodes. This configuration exhibits the most favorable behavior, including the NMAE and R² metrics, and generally underestimates the output result by approximately 1 Wh on average.
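The trial-and-error search described above can be outlined as a loop over layer counts and node counts with several random restarts per configuration. In this sketch, `train_and_validate` is a hypothetical callable standing in for an actual training run, the toy stand-in below it returns made-up numbers, and testing every integer node count from 3 to 35 is an assumption, since the step size is not stated.

```python
import random
from itertools import product


def search_architecture(train_and_validate, node_counts=range(3, 36),
                        layer_counts=(1, 2), restarts=5, seed=0):
    """Return (validation RMSE, hidden layers, nodes) of the best network."""
    rng = random.Random(seed)
    best = None
    for layers, nodes in product(layer_counts, node_counts):
        # Keep only the best of several random weight initializations.
        val_rmse = min(
            train_and_validate(layers, nodes, rng.random())
            for _ in range(restarts)
        )
        if best is None or val_rmse < best[0]:
            best = (val_rmse, layers, nodes)
    return best


def toy_train_and_validate(layers, nodes, init):
    """Made-up stand-in for training; NOT a real network or real results."""
    return abs(nodes - 10) + layers + init


best_rmse, best_layers, best_nodes = search_architecture(toy_train_and_validate)
print(best_layers, best_nodes)
```

In a real run the callable would train a network from the given random initialization and return its RMSE on the validation dataset.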

3.2. Reducing the Number of Variables

A more in-depth analysis is required to ascertain which variables are essential for the model. Figure 4 illustrates the distribution of data points and Pearson coefficients between all pairs of input variables within the complete dataset. This visualization helps identify connections between input variables, even those with nonlinear correlations that may not be evident through Pearson's coefficient. Variables exhibiting strong mutual correlations provide redundant information, allowing for the removal of one of them. Conversely, a valuable correlation with $DPE$ (the output for ANN1) is of significance. Key observations from this analysis include the following:

- Day of year ($DoY$): Although not linearly related to any quantity, the distributions of data points in relation to daily temperatures suggest a connection between these variables. It appears that $DoY$ could be deduced from air temperature data, and, to some extent, relative humidity is also influenced by $DoY$. However, daily production ($DPE$) is not directly linked to $DoY$, as it can reach high values in all months, with lower values typically observed in the winter.
- Temperatures ($T_{min}$, $T_{max}$, and $T_{avg}$): These temperature variables are highly interrelated. Notably, $T_{avg}$ is the least influential variable among them, with Pearson coefficients of 0.97 when compared to $T_{min}$ and $T_{max}$. Both $T_{min}$ and $T_{max}$ also exhibit a correlation with one another. However, their difference, which represents daily temperature fluctuations, can provide valuable information related to average cloudiness. Among the three temperatures, $T_{max}$ has the strongest correlation with $DPE$, with a coefficient of 0.72. Additionally, these temperatures demonstrate an inverse correlation with relative humidity data.
- Relative humidity ($RH_{min}$, $RH_{max}$, and $RH_{avg}$): $RH_{min}$ is a critical parameter, as it exhibits a strong negative correlation with $DPE$ (Pearson coefficient of −0.88). This suggests that it is an important parameter for estimating daily electricity production. In contrast, the average relative humidity ($RH_{avg}$) appears to be less critical, as it correlates well with several other known variables, providing similar information. The maximum relative humidity ($RH_{max}$) often saturates at 100%, which could be used by ANNs to gauge the level of cloudiness.
- Average wind speed ($ws_{avg}$) and hourly wind speed ($ws$): These wind speed variables do not display strong correlations with any other inputs.
- Hour of day ($HoD$): This proves indispensable, as it demonstrates minimal correlation with other variables. This means that the unique information that it provides to ANN2 cannot be substituted by other inputs. Notably, $HoD$ exhibits a near-zero Pearson coefficient with almost all other inputs, except for its low correlations with hourly data for air temperature, relative humidity, and wind speed. Specifically, the point distributions reveal that air temperature and wind speed tend to peak in the early afternoon, while relative humidity decreases.
- Hourly data: Regarding the hourly data, a negative correlation is observed between temperature ($T$) and relative humidity ($RH$), with $r_{x y}$ equal to −0.73.

Figure 5 presents the correlation index between the output variable, which is the hourly energy produced by photovoltaics ($HPE$), and all input parameters. In this context, high correlations indicate the usefulness of the input in predicting the output. Notable linear correlations are primarily observed with hourly air temperature and hourly relative humidity (0.47 and −0.58, respectively). There is a lower correlation with hourly wind speed (0.26). Daily variables do not significantly influence the hourly PV energy production. The graph depicting ${RH}_{min}$ indicates that higher values, such as those close to 100%, correspond to lower energy production. Additionally, $DPE$ exhibits a slight correlation with $HPE$; when daily energy production is low, hourly energy production is also low. As expected, $HPE$ is uniformly distributed with respect to the hour of day ($HoD$). While the relationship is not strictly linear, $HoD$ plays a crucial role in ensuring that the network’s output aligns with the actual target.
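
A low Pearson coefficient does not mean that an input is uninformative, because the coefficient only measures linear association. The sketch below illustrates this with an idealized, symmetric production profile around noon; the sine-shaped profile and the 06:00 to 18:00 window are assumptions made purely for illustration and are not taken from the measured data.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hours from 06:00 to 18:00, symmetric about noon.
hours = list(range(6, 19))

# Idealized bell-shaped production profile in arbitrary units (an assumption):
# zero at 06:00 and 18:00, maximum at 12:00.
hpe = [math.sin(math.pi * (h - 6) / 12) for h in hours]

r_hod = pearson(hours, hpe)
print(f"Pearson r between hour of day and production: {r_hod:.3f}")
```

Here the production is completely determined by the hour, yet the linear coefficient is essentially zero because the rising and falling halves of the curve cancel out. A nonlinear model such as a neural network can still exploit this input.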

To determine which variables could be eliminated, Model1 was retrained by systematically excluding one input at a time. This methodology allowed us to evaluate the significance of each variable in influencing the output. Changes in model performance were assessed by monitoring the RMSE on $HPE$ with the validation dataset. Figure 6 illustrates that the RMSE increases when a variable is eliminated compared to the full model. The omission of $HoD$ carries substantial weight, resulting in a roughly 50% increase in RMSE compared to the full model. Conversely, the information provided by $DoY$ proves to be redundant. The networks can infer this information from the average, minimum, and maximum daily temperatures. Indeed, Figure 4 illustrates a strong correlation between these temperatures and the day of the year.
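
The leave-one-input-out procedure described above can be sketched as follows. To keep the example short and dependency-free, an ordinary least-squares model is used as a lightweight stand-in for the feedforward network, and the data are synthetic; the input names `strong`, `moderate`, and `redundant` are hypothetical and only mimic the roles of an indispensable input, a useful input, and an input whose information is already carried by another one.

```python
import math
import random

def fit_linear(X, y):
    """Least-squares fit with intercept via the normal equations.
    Stand-in for network training; NOT the FFNN used in the study."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    # Augmented normal-equation system [X^T X | X^T y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * t for r, t in zip(rows, y))]
         for i in range(p)]
    # Gaussian elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda k: abs(A[k][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for k in range(col + 1, p):
            factor = A[k][col] / A[col][col]
            for j in range(col, p + 1):
                A[k][j] -= factor * A[col][j]
    # Back substitution.
    w = [0.0] * p
    for i in reversed(range(p)):
        w[i] = (A[i][p] - sum(A[i][j] * w[j] for j in range(i + 1, p))) / A[i][i]
    return w

def predict(w, X):
    return [w[0] + sum(wi * xi for wi, xi in zip(w[1:], r)) for r in X]

def rmse(pred, target):
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target))

def select(X, keep):
    """Keep only the columns listed in `keep`."""
    return [[r[i] for i in keep] for r in X]

def ablation_scores(names, X_train, y_train, X_val, y_val):
    """Validation RMSE of the full model and of each model retrained without one input."""
    def run(keep):
        w = fit_linear(select(X_train, keep), y_train)
        return rmse(predict(w, select(X_val, keep)), y_val)
    every = list(range(len(names)))
    scores = {"full": run(every)}
    for drop, name in enumerate(names):
        scores[name] = run([i for i in every if i != drop])
    return scores

def make_data(n, rng):
    """Synthetic data: the third input is a noisy copy of the second one."""
    X, y = [], []
    for _ in range(n):
        a = rng.random()                  # strongly informative input
        b = rng.random()                  # moderately informative input
        c = b + rng.gauss(0.0, 0.05)      # near-duplicate of b (redundant)
        X.append([a, b, c])
        y.append(3.0 * a + 1.0 * b + rng.gauss(0.0, 0.05))
    return X, y

rng = random.Random(0)
names = ["strong", "moderate", "redundant"]
X_train, y_train = make_data(300, rng)
X_val, y_val = make_data(100, rng)
scores = ablation_scores(names, X_train, y_train, X_val, y_val)
for name, value in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:>10s}: RMSE = {value:.3f} ({100 * (value / scores['full'] - 1):+.0f}% vs full)")
```

Sorting the inputs by the RMSE obtained without them gives an importance ranking of the kind shown in Figure 6: removing the redundant input barely changes the error, while removing the strongly informative one degrades it markedly.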

Furthermore, Figure 6 demonstrates that all daily mean values, denoted by the subscript $avg$, are dispensable. Similarly, the maximum relative humidity does not contribute valuable information. In contrast, the minimum relative humidity, as previously noted, holds more significance than other daily humidity data, exhibiting a stronger correlation with the $DPE$. This is supported by its higher ranking compared to others. ${RH}_{min}$, along with $DPE$, holds a mid-ranking position, indicating their nearly equal influence, possibly due to the similarity in the information that they provide. Since the absence of both variables leads to a moderate increase in RMSE, it is reasonable to eliminate them, as they do not significantly contribute to the network. At the top of the importance ranking, the hourly quantities are present, along with the two daily minimum and maximum temperatures.

3.3. Model2 Design and Architecture

The analysis of the input variables’ importance led to the development of Model2, which exhibits a simplified structure (depicted in Figure 7) compared to Model1. The excluded variables in Model2 are ${RH}_{avg}$, ${ws}_{avg}$, ${RH}_{max}$, $T_{avg}$, ${RH}_{min}$, $DPE$, and $T_{max}$. The criteria for elimination were primarily based on the ranking presented in Figure 6, with one exception. The $DoY$ was retained, as it is not subject to prediction error. Its significance in Model1 is minimal due to redundancy with the information provided by the three daily temperatures, as demonstrated in Figure 4. The elimination of some of these temperatures, which are susceptible to forecast errors, could restore importance to $DoY$. Specifically, $T_{max}$ was omitted, given that the simultaneous presence of $DoY$, $T_{min}$, and $T_{max}$ would offer redundant information. Although these three variables are interconnected, the network requires dual information: a daily temperature value for referencing hourly temperature values, and knowledge of the maximum daily fluctuation for characterizing sky coverage. Consequently, $DoY$ was retained, while $T_{max}$ was removed due to its lower ranking compared to $T_{min}$. Since $DPE$ was eliminated, there was no need to introduce the first neural network present in Model1. The structure of Model2 is consequently lighter and reduced to a minimum.

The neural network in Model2 is denoted as ANN3, featuring 10 input nodes and 1 output node. Similar to Model1, the architecture was determined through a trial-and-error process. Various configurations were tested, including those with one and two hidden layers, with the number of nodes in each layer ranging from 3 to 35. The results obtained are summarized in Table 5. The optimal network was identified as having one hidden layer with 10 nodes. On the validation dataset, this configuration achieved an NRMSE of 6.00%, an NMAE of 2.9%, an NMBE of −0.3%, and an R² regression index of 0.949 (Table 5), indicating a strong correlation between the output and target values. Networks with two hidden layers did not improve on this result.
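
The statistical indices used to compare the candidate architectures can be computed as in the sketch below. The formulas are the standard textbook forms and are an assumption here: the errors are taken as output minus target (so that a negative bias means underestimation), the normalization constant `norm` is taken to be a reference value such as the panel peak power, and R² is computed as the coefficient of determination. The sample values and the constant 230 are hypothetical.

```python
import math

def forecast_metrics(pred, target, norm):
    """Return NRMSE, NMBE, NMAE (as fractions of `norm`) and R^2.

    pred, target: paired hourly energy values (e.g. Wh).
    norm: normalization constant (assumed here to be the panel peak value).
    """
    n = len(target)
    errors = [p - t for p, t in zip(pred, target)]
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mbe = sum(errors) / n                 # negative => underestimation on average
    mae = sum(abs(e) for e in errors) / n
    mean_t = sum(target) / n
    ss_tot = sum((t - mean_t) ** 2 for t in target)
    return {
        "NRMSE": rmse / norm,
        "NMBE": mbe / norm,
        "NMAE": mae / norm,
        "R2": 1.0 - ss_res / ss_tot,
    }

# Hypothetical hourly values for one day (NOT data from the study).
target = [0.0, 50.0, 120.0, 180.0, 150.0, 60.0, 0.0]
pred = [0.0, 45.0, 110.0, 170.0, 155.0, 55.0, 0.0]

m = forecast_metrics(pred, target, norm=230.0)
for key, value in m.items():
    print(f"{key}: {value:.4f}")
```

Evaluating each candidate network with the same function on the validation dataset and keeping the configuration with the lowest NRMSE mirrors the trial-and-error selection summarized in Table 5.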

In analogy to the previous procedure with Model1, Figure 8 illustrates the ranking of the most important variables when the network is retrained with the omission of certain inputs. It is noteworthy that further elimination of some variables leads to a deterioration in results. Relative humidity gains increased significance compared to Model1, since information about the daily maximum and minimum limits of the same variable is no longer available. The minimum temperature loses positions in the ranking, and the least useful variable continues to be the day of the year. This underscores the importance of retaining certain variables to preserve the network’s performance and highlights the role of relative humidity in the absence of specific temperature data.

3.4. Testing and Sensitivity Analysis

The accuracy of the models must be assessed using the test dataset, and Table 6 presents the calculated statistical indices for both models. The results pertain to $HPE$. The NRMSEs are comparable to those obtained with the validation dataset, indicating that overfitting was avoided.

3.4.1. Daily Electricity Forecast

Figure 9 illustrates the estimated daily electrical energy ($DPE$) produced by both models, comparing them with experimental data over the course of one year (2017). The models closely track the daily electrical energy, effectively capturing the nature of each day and providing reliable production estimates. During summer, challenges arise on cloudy days, while on winter days both models adeptly align with the experimental data.

In January, the daily photovoltaic (PV) electrical energy production from the test dataset was lower compared to other months. Despite the high variability of the atmospheric data, the models were able to follow the experimental trend with very good accuracy. In the months of April and July, the models performed well; however, in these months they faced more difficulties than in the winter ones, especially on cloudy days. In particular, on 9 April and 10 July, the models underestimated the electricity production. However, they correctly predicted the reduction in electricity on 29 July.

Figure 10 illustrates the distribution of predicted daily photovoltaic electrical energy in comparison to the observed data. Model1 exhibits an RMSE of approximately 128 Wh, with an R² regression index of 0.914. On average, the data are underestimated by about 23 Wh. Model2 shows a slightly higher RMSE, at 135 Wh, with an R² of 0.902 and an MBE of approximately −17.7 Wh. The latter is lower in absolute terms than that obtained with Model1. In both cases, the slope of the regression line is slightly lower than that of the quadrant bisector, and the intercepts of the lines are slightly higher than the origin of the axes. The most significant errors occur at points with intermediate magnitudes. The models demonstrate precision in identifying electricity production on both clear and overcast days, while encountering some difficulties on partly cloudy days. Despite these challenges and the utilization of only temperature, relative humidity, and wind speed data, the results can still be deemed highly satisfactory.
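
The meaning of a regression slope below that of the bisector combined with a positive intercept can be illustrated with a small worked example. The sketch below fits the least-squares line of model output against target and computes the mean bias; the five daily values are hypothetical and were chosen so that the line is exactly output = 0.9 × target + 80.

```python
def regression_line(target, output):
    """Least-squares line output = slope * target + intercept."""
    n = len(target)
    mean_t = sum(target) / n
    mean_o = sum(output) / n
    sxx = sum((t - mean_t) ** 2 for t in target)
    sxy = sum((t - mean_t) * (o - mean_o) for t, o in zip(target, output))
    slope = sxy / sxx
    return slope, mean_o - slope * mean_t

# Hypothetical daily energies in Wh (NOT data from the study).
target = [200.0, 600.0, 1000.0, 1400.0, 1800.0]
output = [260.0, 620.0, 980.0, 1340.0, 1700.0]

slope, intercept = regression_line(target, output)
mbe = sum(o - t for o, t in zip(output, target)) / len(target)
print(f"slope = {slope:.2f}, intercept = {intercept:.1f} Wh, MBE = {mbe:.1f} Wh")
```

With a slope below 1 and a positive intercept, days with low production tend to be overestimated and days with high production underestimated. In this toy case the underestimation at the high end dominates, so the mean bias is negative, the same sign as reported for both models.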

3.4.2. Hourly Electricity Forecast

The models generate outputs in the form of hourly electrical energy produced by the photovoltaic panel, as depicted in Figure 11 using results from the test dataset. For visualization purposes, five representative days are displayed for each of the three months, illustrating different sky conditions. On 1 January, a notable day, high electricity production was observed in the morning, before sharply declining to almost zero in the afternoon. Both models effectively predicted the morning production but struggled to accurately forecast the afternoon electricity values. The two cloudy days of 2 January and 5 January were accurately predicted. The day of 3 January was clear, and this was also correctly assessed. However, the models were less accurate on 4 January, failing to predict energy production during peak hours. This behavior is typical of days that do not fit squarely into the clear or cloudy categories. In April and July, electricity production was higher than in January. For these months, the time trend deserves particular attention. In all cases, the energy produced was closely followed during morning and afternoon hours, with some disparities in the peak power recordings. The models satisfactorily estimated the unique trends observed on the mornings of 13 April, 16 April, and 28–29 July. However, peak power was often underestimated, with an exception on 15 April. Overall, the models successfully identified experimental trends.

Figure 12 presents regression lines for both models in relation to hourly data, with points distributed around the bisector of the quadrant. Similar to the findings for daily PV electricity values, forecasting challenges were more prominent for intermediate powers. Nonetheless, the majority of cases were accurately estimated, resulting in RMSE values ranging from 14 to 15 Wh for both models. The regression index was around 0.95.

3.4.3. Sensitivity Analysis

The input variables coming from numerical weather prediction (NWP) are susceptible to errors, which can be significant. Prediction models must demonstrate the ability to respond appropriately even when used with inaccurate input data. Notably, the hour of the day and the day of the year are inherently error-free quantities. Therefore, the subsequent analysis focuses on investigating the models’ behavior solely in response to errors in temperature, relative humidity, and wind speed. To assess the impact of input errors, perturbations were introduced to the trends of these variables, monitoring the corresponding increase in the RMSE of the output, namely the predicted hourly photovoltaic electrical energy ($HPE$). Three types of errors were considered: (1) a random (Gaussian) error, (2) an upward offset error, and (3) a downward offset error, each equivalent to the standard deviation calculated from the test dataset. The graphs presented in Figure 13 illustrate the increase in the RMSE of the predicted $HPE$ for the three error cases examined.

The Gaussian error modifies the time trend, eliminating the information on the gradients with respect to the previous and next times. The perturbations are distributed around the actual mean values of the quantities, so the overall trend of the variables remains the same. The graph shows that the error in the temperature has a significant effect on the final result, with Model1 reacting better than Model2. The same error introduced to relative humidity and wind speed caused an RMSE increase of about 5%; in this case, Model2 reacts better than Model1.

The models are stable with respect to temperature and wind speed when their values are shifted upwards. However, both models suffer greatly from this type of error associated with relative humidity. On the other hand, in cases where the input variables are reduced by a constant value, the models continue to behave appropriately. The systematic error therefore only affects the models’ performance when the relative humidity is increased by a constant value. In fact, the models can use this information to detect overcast or rainy days. Overestimating relative humidity implies that the models perceive the day as cloudier than it actually is. Model1 appears to be more dependent on daily data, such as maximum relative humidity, while Model2, relying primarily on the hourly variations in quantities, exhibits a weaker dependency.
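
The three perturbation types considered in this subsection can be sketched as follows. The function names, the toy model, and the sample values are hypothetical; in particular, the toy model is a simple linear rule clipped at zero, not one of the trained networks, and no clamping of the perturbed relative humidity to the 0–100% range is applied because the text does not state how this was handled.

```python
import math
import random
import statistics

def perturb(series, mode, rng):
    """Return a copy of `series` with an error of one standard deviation applied."""
    sigma = statistics.pstdev(series)
    if mode == "random":          # zero-mean Gaussian noise, std = sigma
        return [v + rng.gauss(0.0, sigma) for v in series]
    if mode == "offset_up":       # constant upward shift by sigma
        return [v + sigma for v in series]
    if mode == "offset_down":     # constant downward shift by sigma
        return [v - sigma for v in series]
    raise ValueError(f"unknown mode: {mode}")

def rmse(pred, target):
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target))

def rmse_increase(model, inputs, target, name, mode, rng):
    """Relative RMSE increase when the input `name` is perturbed."""
    base = rmse(model(inputs), target)
    perturbed = dict(inputs)
    perturbed[name] = perturb(inputs[name], mode, rng)
    return (rmse(model(perturbed), target) - base) / base

def toy_model(inputs):
    """Hypothetical stand-in model: production falls as relative humidity rises."""
    return [max(0.0, 150.0 - 1.5 * rh) for rh in inputs["RH"]]

# Hypothetical hourly inputs and measured energies (NOT data from the study).
inputs = {"RH": [90.0, 70.0, 50.0, 40.0, 45.0, 60.0]}
target = [10.0, 50.0, 70.0, 95.0, 80.0, 65.0]

rng = random.Random(0)
increase_up = rmse_increase(toy_model, inputs, target, "RH", "offset_up", rng)
increase_random = rmse_increase(toy_model, inputs, target, "RH", "random", rng)
print(f"offset up: {100 * increase_up:+.0f}%   random: {100 * increase_random:+.0f}%")
```

Repeating the call for each variable, each error type, and each model gives the kind of comparison summarized in Figure 13.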

3.4.4. Tests with NWP Data

The tests were conducted utilizing the atmospheric forecast illustrated in Figure 3, and Figure 14 displays the hourly results. When using weather.com as a data source, the outcomes were satisfactory, except for 5 March, where the predicted production exceeded the actual production. On the cloudy days of 6–7 March, Model2 accurately followed the actual trend. However, the reduction in production at noon on 8 March was not predicted by either model. When ilmeteo.it was used as a source, the results showed a slight deterioration. Specifically, on 5 March, the models predicted very low production. The following day saw well-predicted morning hours, but a peak in afternoon production went undetected. On 7 March, the models underestimated the electricity production. Although the daily electricity production on 8 March was well predicted, the sudden reduction at 12:00 p.m. was not detected by either model. For clear days, Model2 appeared to be more accurate in predicting the hourly pattern, with both models recognizing these as clear days. Only on 11 March was the electricity production underestimated. The statistical indices presented in Table 7 reveal that using NWP from ilmeteo.it yields an RMSE of about 27 Wh with both models. The best results were obtained with NWP from weather.com, with Model2 performing the best, achieving an RMSE of 24.9 Wh. However, it should be noted that the overall performance was lower than in the previous test. Undoubtedly, the inherent errors in the weather forecasts from which the data were derived significantly impacted the estimation of electricity production.

Unlike this study, various other studies relying on NWP data have incorporated solar radiation as a predictive input. For instance, the model proposed in Ref. [32] achieved an NMAE of 6%, consistent with the findings of our current study. Using statistical methods, De Giorgi et al. [33] reported forecast NRMSE values of 12.57%, 12.60%, and 10.91% for input vectors involving historical PV output, solar irradiance, and module temperature, respectively. On the other hand, Sharma et al. [34] obtained an NRMSE ranging from 9.42% to 15.41%, also incorporating solar radiation as an input. Therefore, the results of the current study can be considered very good, given the utilization of a reduced number of variables.

In addressing potential challenges, it is important to recognize the impact of varying wind speeds on the method’s performance. The training dataset used in this study reflects a diverse range of wind speed conditions. However, it is noteworthy that the reliability of the method may be influenced by the specific patterns of temperature and humidity coupling, which can be different in other locations. In particular, the robustness of the proposed method may face challenges when applied to locations with distinct temperature and humidity pairings. Therefore, it is crucial to acknowledge the limitations of the model’s generalization across diverse climatic regions. Subsequent investigations and additional training with datasets from various locations may be required to enhance the method’s validity in such cases.

4. Conclusions

Photovoltaics is emerging as a pivotal technology in harnessing renewable energy sources, playing a crucial role in the transition toward a decarbonized energy future. The primary objective of this study was to forecast the electricity generated by photovoltaic panels on the following day. This prediction was achieved using easily accessible input data obtained from weather forecast websites (specifically, air temperature, relative humidity, and wind speed).

To accomplish this goal, two forecasting models with hourly resolution were developed based on artificial neural networks. The first model incorporates various data as inputs, including hourly values, and is supplemented with processed daily values to aid in identifying the type of day (i.e., overcast, clear, or partly cloudy). Subsequent analysis enabled the determination of the relative importance of each variable, leading to the elimination of redundant or unhelpful information. Model2 shares the same objective as the initial model but employs a reduced set of input data while maintaining a similar accuracy. The training process utilized experimental data gathered over three years at the University of Calabria. Notably, it was found that only a single hidden layer for the feedforward networks was sufficient, eliminating the need for multiple hidden layers. The key conclusions drawn from this study include the following:

(1): The day of the year is not important for the prediction, as similar information is provided by the minimum and maximum daily temperatures.
(2): The daily minimum relative humidity correlates with the daily PV energy production, with a good Pearson’s coefficient: −0.88.
(3): The models are stable if the input variables have a constant offset error.
(4): The most valuable information for the prediction is the hourly temperature trend.
(5): The models provide very good estimates when using experimental data as inputs. The coefficient of determination is about 0.95, with an RMSE of about 15 Wh.
(6): The accuracy of the forecast slightly decreases when the input information is taken from weather forecast websites. The coefficient of determination was 0.879 in the two weeks analyzed. The RMSE was 24.9 Wh. The accuracy of the forecast is closely linked to the accuracy of the NWP data. The results are dependent on source data, but they are nevertheless appreciable.
(7): The good behavior of Model2 implies that it is not necessary to provide too much information. Hourly trends of the three meteorological quantities and the daily minimum temperature are sufficient.

The limitation of this study is that the networks were trained on local climatic conditions. It would be interesting to assess whether they also perform well in different locations. Despite this limitation, our research has yielded valuable insights into electricity generation forecasting, addressing the challenges posed by the variable availability of solar sources—a concern that is gaining significance. The findings derived from experimental measurements offer valuable information for understanding the factors that exert the most influence on forecasting accuracy.

The practical implications of this work extend to utilities, where the economic impact is manifested through energy savings achieved via effective scheduling of electrical loads and an increase in self-consumed energy. The model, reliant on easily accessible data from websites, is usable by everyone. Leveraging public data ensures the seamless expansion and integration of this technology into control systems, facilitating its broader applicability. The incorporation of these models into smart grid frameworks represents a promising trajectory. The models’ hourly resolution aligns seamlessly with the dynamic nature of smart grids, enabling real-time adjustments and fostering an interconnected, responsive energy ecosystem. Microgrid architectures, which often rely on renewable sources, could benefit from the precision of these models in adapting to the fluctuations inherent in distributed energy systems. By facilitating informed decision-making in energy consumption patterns, these models contribute to the broader mission of transitioning towards sustainable and environmentally conscious energy practices. Moreover, as technological landscapes evolve, the adaptability of these models can be explored in conjunction with emerging technologies such as IoT (Internet of things) devices and advanced sensors. In essence, the hourly PV models’ versatility positions them as catalysts for holistic advancements in the realm of renewable energy utilization.

Author Contributions

Conceptualization, F.N.; methodology, F.N.; software, F.N.; validation, F.N.; formal analysis, F.N. and P.B.; investigation, F.N.; resources, P.B.; data curation, F.N. and P.B.; writing—original draft preparation, F.N.; writing—review and editing, P.B.; visualization, F.N.; supervision, F.N.; project administration, F.N. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Lu, Q.; Guo, Q.; Zeng, W. Optimization scheduling of home appliances in smart home: A model based on a niche technology with sharing mechanism. Int. J. Electr. Power Energy Syst. 2022, 141, 108126.
2. Goyal, G.R.; Vadhera, S. Multi-interval programming based scheduling of appliances with user preferences and dynamic pricing in residential area. Sustain. Energy Grids Netw. 2021, 27, 100511.
3. Boland, J.; David, M.; Lauret, P. Short term solar radiation forecasting: Island versus continental sites. Energy 2016, 113, 186–192.
4. Wang, G.; Su, Y.; Shu, L. One-day-ahead daily power forecasting of photovoltaic systems based on partial functional linear regression models. Renew. Energy 2016, 96, 469–478.
5. Li, Y.; Su, Y.; Shu, L. An ARMAX model for forecasting the power output of a grid connected photovoltaic system. Renew. Energy 2014, 66, 78–89.
6. Reikard, G. Predicting solar radiation at high resolutions: A comparison of time series forecasts. Sol. Energy 2009, 83, 342–349.
7. Yaman, K.; Arslan, G. A detailed mathematical model and experimental validation for coupled thermal and electrical performance of a photovoltaic (PV) module. Appl. Therm. Eng. 2021, 195, 117224.
8. Bevilacqua, P.; Perrella, S.; Bruno, R.; Arcuri, N. An accurate thermal model for the PV electric generation prediction: Long-term validation in different climatic conditions. Renew. Energy 2020, 163, 1092–1112.
9. Aly, S.P.; Ahzi, S.; Barth, N.; Abdallah, A. Using energy balance method to study the thermal behavior of PV panels under time-varying field conditions. Energy Convers. Manag. 2018, 175, 246–262.
10. Notton, G.; Cristofari, C.; Mattei, M.; Poggi, P. Modelling of a double-glass photovoltaic module using finite differences. Appl. Therm. Eng. 2005, 25, 2854–2877.
11. Aly, S.P.; Ahzi, S.; Barth, N.; Figgis, B.W. Two-dimensional finite difference-based model for coupled irradiation and heat transfer in photovoltaic modules. Sol. Energy Mater. Sol. Cells 2018, 180, 289–302.
12. Marwaha, S.; Pratik, P.; Ghosh, K. Thermal Model of Silicon Photovoltaic Module with Incorporation of CFD Analysis. Silicon 2021, 14, 4493–4499.
13. Abdel-Nasser, M.; Mahmoud, K. Accurate photovoltaic power forecasting models using deep LSTM-RNN. Neural Comput. Appl. 2019, 31, 2727–2740.
14. Han, S.; Qiao, Y.-H.; Yan, J.; Liu, Y.-Q.; Li, L.; Wang, Z. Mid-to-long term wind and photovoltaic power generation prediction based on copula function and long short term memory network. Appl. Energy 2019, 239, 181–191.
15. Gao, M.; Li, J.; Hong, F.; Long, D. Short-Term Forecasting of Power Production in a Large-Scale Photovoltaic Plant Based on LSTM. Appl. Sci. 2019, 9, 3192.
16. Pedro, H.T.C.; Coimbra, C.F.M. Assessment of forecasting techniques for solar power production with no exogenous inputs. Sol. Energy 2012, 86, 2017–2028.
17. Lin, K.-P.; Pai, P.-F. Solar power output forecasting using evolutionary seasonal decomposition least-square support vector regression. J. Clean. Prod. 2016, 134, 456–462.
18. Netsanet, S.; Zheng, D.; Zhang, W.; Teshager, G. Short-term PV power forecasting using variational mode decomposition integrated with Ant colony optimization and neural network. Energy Rep. 2022, 8, 2022–2035.
19. Mellit, A.; Pavan, A.M. A 24-h forecast of solar irradiance using artificial neural network: Application for performance prediction of a grid-connected PV plant at Trieste, Italy. Sol. Energy 2010, 84, 807–821.
20. Yeo, I.-A.; Yee, J.-J. A proposal for a site location planning model of environmentally friendly urban energy supply plants using an environment and energy geographical information system (E-GIS) database (DB) and an artificial neural network (ANN). Appl. Energy 2014, 119, 99–117.
21. Chen, C.; Duan, S.; Cai, T.; Liu, B. Online 24-h solar power forecasting based on weather type classification using artificial neural network. Sol. Energy 2011, 85, 2856–2870.
22. Lee, D.; Kim, K. PV power prediction in a peak zone using recurrent neural networks in the absence of future meteorological information. Renew. Energy 2021, 173, 1098–1110.
23. Mittal, M.; Bora, B.; Saxena, S.; Gaur, A.M. Performance prediction of PV module using electrical equivalent model and artificial neural network. Sol. Energy 2018, 176, 104–117.
24. Jung, Y.; Jung, J.; Kim, B.; Han, S. Long short-term memory recurrent neural network for modeling temporal patterns in long-term power forecasting for solar PV facilities: Case study of South Korea. J. Clean. Prod. 2020, 250, 119476.
25. Almonacid, F.; Pérez-Higueras, P.; Fernández, E.F.; Hontoria, L. A methodology based on dynamic artificial neural network for short-term forecasting of the power output of a PV generator. Energy Convers. Manag. 2014, 85, 389–398.
26. Manasrah, A.; Masoud, M.; Jaradat, Y.; Bevilacqua, P. Investigation of a Real-Time Dynamic Model for a PV Cooling System. Energies 2022, 15, 1836.
27. Hossain, M.; Mekhilef, S.; Danesh, M.; Olatomiwa, L.; Shamshirband, S. Application of extreme learning machine for short term output power forecasting of three grid-connected PV systems. J. Clean. Prod. 2017, 167, 395–405.
28. Shapsough, S.; Dhaouadi, R.; Zualkernan, I. Using Linear Regression and Back Propagation Neural Networks to Predict Performance of Soiled PV Modules. Procedia Comput. Sci. 2019, 155, 463–470.
29. Boger, Z.; Guterman, H. Knowledge extraction from artificial neural network models. In Proceedings of the IEEE Systems, Man, and Cybernetics Conference, Orlando, FL, USA, 12–15 October 1997.
30. Available online: https://weather.com/ (accessed on 30 March 2022).
31. ilMeteo s.r.l. Available online: https://www.ilmeteo.it/ (accessed on 30 March 2022).
32. Böök, H.; Lindfors, A.V. Site-specific adjustment of a NWP-based photovoltaic production forecast. Sol. Energy 2020, 211, 779–788.
33. De Giorgi, M.G.; Congedo, P.M.; Malvoni, M. Photovoltaic power forecasting using statistical methods: Impact of weather data. IET Sci. Meas. Technol. 2014, 8, 90–97.
34. Sharma, V.; Yang, D.; Walsh, W.; Reindl, T. Short term solar irradiance forecasting using a mixed wavelet neural network. Renew. Energy 2016, 90, 481–492.

Figure 1. Summary diagram.

Figure 2. Model1’s structure.

Figure 3. Temperature and relative humidity forecasts from weather.com [30] and ilmeteo.it [31], compared to experimental measurements.

Figure 4. Mutual correlation between inputs.

Figure 5. Correlation between the inputs and target.

Figure 6. RMSE on the hourly energy produced with reference to the validation dataset. Networks trained by eliminating single inputs from Model1.

Figure 7. Model2’s structure.

Figure 8. RMSE on the hourly energy produced with reference to the validation dataset. Networks trained by eliminating single inputs from Model2.

Figure 9. Daily electrical energy production with the training and test datasets.

Figure 10. Daily PV energy output–target regression with the test dataset.

Figure 11. Hourly electrical energy production with the test dataset.

Figure 12. Hourly PV energy output–target regression with the test dataset.

Figure 13. Increase in RMSE for $HPE$ with errors in input variables.

Figure 14. Tests with NWP.

Table 1. Possible inputs.

Input	Abbreviation
Day of year	$DoY$
Daily minimum temperature	$T_{min}$
Daily maximum temperature	$T_{max}$
Daily mean temperature	$T_{avg}$
Daily minimum relative humidity	${RH}_{min}$
Daily maximum relative humidity	${RH}_{max}$
Daily mean relative humidity	${RH}_{avg}$
Daily mean wind speed	${ws}_{avg}$
Hour of day	$HoD$
Hourly temperature	$T$
Hourly relative humidity	$RH$
Hourly wind speed	$ws$

Table 2. Climatic data sensor specifications.

	Temperature	Relative Humidity	Wind Speed
Sensor	Pt 100 1/3	Capacitive	Cup anemometer
Range	−50–70 °C	0–100%	0–75 m/s
Accuracy	0.1 °C	±1.5%	2.5%

Table 3. Definition of ANN1’s architecture. Statistical indices on the validation dataset.

ANN1
Hidden Neurons	NRMSE	NMBE	NMAE	R²
3	9.57%	−1.5%	7.4%	0.858
5	9.91%	−0.8%	7.5%	0.844
10	10.39%	−0.8%	7.9%	0.830
15	9.59%	−1.5%	7.4%	0.856
20	9.74%	−0.7%	7.5%	0.849
25	10.23%	−2.3%	8.0%	0.841
30	9.61%	−0.6%	7.3%	0.853
35	9.65%	−2.5%	7.7%	0.864
3-3	10.26%	0.3%	7.5%	0.838
5-5	9.97%	−1.0%	7.9%	0.844
10-10	10.05%	−0.3%	7.8%	0.839
15-15	9.87%	−1.4%	7.3%	0.848
20-20	9.63%	−0.9%	7.7%	0.854
25-25	9.62%	−0.3%	7.5%	0.854
30-30	9.89%	−3.2%	7.5%	0.861
35-35	10.60%	−1.8%	8.0%	0.830

Table 4. Definition of ANN2’s architecture. Statistical indices on the validation dataset.

ANN2
Hidden Neurons	NRMSE	NMBE	NMAE	R²
3	6.37%	−0.3%	3.3%	0.942
5	6.37%	−0.4%	3.4%	0.942
10	5.86%	−0.4%	2.9%	0.951
15	6.20%	−0.6%	3.1%	0.946
20	6.27%	−0.3%	3.2%	0.944
25	6.17%	−0.4%	3.1%	0.945
30	6.48%	−0.5%	3.2%	0.940
35	6.58%	−0.5%	3.3%	0.938
3-3	6.42%	−0.4%	3.2%	0.941
5-5	6.14%	−0.3%	3.1%	0.946
10-10	6.26%	−0.3%	3.0%	0.944
15-15	6.05%	−0.5%	2.9%	0.948
20-20	6.36%	−0.3%	3.1%	0.942
25-25	6.25%	−0.3%	3.0%	0.944
30-30	6.62%	−0.5%	3.3%	0.937
35-35	6.72%	−0.3%	3.2%	0.935

Table 5. Definition of ANN3’s architecture. Statistical indices on the validation dataset.

ANN3
Hidden Neurons	NRMSE	NMBE	NMAE	R²
3	6.47%	−0.4%	3.4%	0.940
5	6.12%	−0.3%	3.0%	0.946
10	6.00%	−0.3%	2.9%	0.949
15	6.08%	−0.2%	3.0%	0.947
20	6.22%	−0.5%	3.0%	0.945
25	6.17%	−0.5%	3.1%	0.946
30	6.13%	−0.3%	3.0%	0.946
35	6.29%	−0.1%	3.1%	0.943
3-3	6.40%	−0.1%	3.2%	0.941
5-5	6.29%	−0.4%	3.1%	0.943
10-10	6.09%	−0.2%	2.9%	0.947
15-15	6.07%	−0.6%	2.9%	0.947
20-20	6.13%	−0.2%	3.0%	0.946
25-25	6.30%	−0.2%	3.2%	0.943
30-30	6.15%	−0.3%	3.1%	0.946
35-35	6.44%	−0.2%	3.1%	0.940

Table 6. Statistical indices for $HPE$ on the test dataset.

Testing
	NRMSE	NMBE	NMAE	R²
Model1	5.93%	−0.3%	3.1%	0.952
Model2	6.11%	−0.4%	3.4%	0.948
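
The four indices reported in these tables can be computed from paired measured and predicted hourly values. The following is a minimal sketch, not the authors' code. It assumes that the errors are normalized by the panel peak power (as the abstract suggests for the RMSE), that the bias is taken as predicted minus measured, and that R² is the usual coefficient of determination. The function name `forecast_indices` and its arguments are hypothetical.

```python
import math


def forecast_indices(measured, predicted, peak_power):
    """Return NRMSE, NMBE, NMAE and R2 for paired hourly values.

    Assumptions (not confirmed by the source): errors are normalized by
    peak_power, and the bias is defined as predicted minus measured.
    """
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(measured)
    errors = [p - m for m, p in zip(measured, predicted)]

    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mbe = sum(errors) / n
    mae = sum(abs(e) for e in errors) / n

    # Coefficient of determination: 1 - residual SS / total SS.
    mean_measured = sum(measured) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)
    r2 = 1.0 - ss_res / ss_tot

    return {
        "NRMSE": rmse / peak_power,
        "NMBE": mbe / peak_power,
        "NMAE": mae / peak_power,
        "R2": r2,
    }


# Illustrative values only, not data from the study.
example = forecast_indices(
    measured=[0.0, 1.0, 2.0, 3.0],
    predicted=[0.0, 1.0, 2.0, 5.0],
    peak_power=4.0,
)
print({name: round(value, 3) for name, value in example.items()})
```

With this convention, a negative NMBE such as those in Table 6 would indicate a slight underestimate of production on average. If the opposite sign convention is used, the interpretation is reversed.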

Table 7. Statistical indices for $HPE$ with NWP.

Testing—Input Data Source: weather.com
	NRMSE	NMBE	NMAE	R²
Model1	12.8%	0.9%	6.7%	0.851
Model2	11.6%	0.8%	5.5%	0.879
Testing—Input Data Source: ilmeteo.it
	NRMSE	NMBE	NMAE	R²
Model1	12.5%	−0.2%	6.1%	0.854
Model2	12.9%	1.3%	6.0%	0.859

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Nicoletti, F.; Bevilacqua, P. Hourly Photovoltaic Production Prediction Using Numerical Weather Data and Neural Networks for Solar Energy Decision Support. Energies 2024, 17, 466. https://doi.org/10.3390/en17020466
