1. Introduction
In recent years, solar energy has come to be regarded as one of the most promising renewable sources for meeting a substantial part of the world’s energy demand [1,2]. Accurate knowledge of solar radiation is therefore the basic step in assessing solar energy availability [3], and it serves as the first input for many solar energy applications [4,5]. However, solar radiation data are unavailable at many sites around the world because of the high cost of the measuring devices and their calibration and maintenance requirements [6,7]. For this reason, several models have been proposed for estimating global solar radiation (GSR). These models rely on various methodologies, including empirical methods, geostationary satellite images, artificial intelligence (AI) techniques such as artificial neural networks (ANNs), time series methods, physically based radiative transfer models, and stochastic weather methods [8,9,10,11,12,13].
Generally, the empirical approach is the most commonly used; it relies mainly on the correlation between global solar radiation and meteorological parameters such as sunshine duration, temperature, and relative humidity. Ångström [14] presented the original sunshine-based solar radiation model. Ångström’s model was later updated by Prescott [15] and has become the most widely employed model globally for predicting solar radiation [16,17]. Youssef et al. [18] assessed the efficacy of 31 non-sunshine-based solar models for forecasting global solar radiation on a horizontal plane. The models providing the most accurate predictions were identified, as were the best among all developed models. Hassan et al. [19] introduced new temperature-based solar models for evaluating global solar radiation. The results showed that the new models predict global solar radiation very well at various locations. Moreover, the best of these temperature-based models also performed better than the two most accurate sunshine-based solar models from the literature. Mostafa et al. [20] studied the proficiency of fifty-two sunshine-based solar radiation models in computing global solar radiation on horizontal surfaces, using Jouf City, KSA, as a case study. The results revealed that some models are not suitable for use in Jouf City, while others exhibited different behavior. Similarly, Mahmoud S. Audi [21] proposed a statistical analysis of nine solar and other climatic parameters in Amman, Jordan. M.A. Alsaad [22] presented a correlation to estimate the average global solar radiation incident on a horizontal plane in Amman, Jordan. Barbaro et al.’s model [23] was updated by Robaa [24] to predict global solar radiation in Egypt. The findings showed that the modified model outperformed the others in predicting global solar radiation across Egypt. A new model was developed by Ajayi et al. [25] to estimate the daily potential global solar radiation in Nigeria. The results revealed a high level of agreement between model predictions and observed data. El-Metwally presented a simple solar radiation model for computing global solar radiation [26]. According to the results, the developed model provides a good estimation of global solar radiation on a horizontal plane. Quej et al. [27] investigated the efficacy and applicability of thirteen solar models based on various meteorological parameters, such as temperature, for estimating global solar radiation in the Yucatán Peninsula, Mexico.
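For orientation, the sunshine-based Ångström–Prescott relation referred to above is usually written in the following linear form. The symbols are the conventional ones and are assumed here rather than taken from this article: H is the monthly average daily global radiation on a horizontal surface, H_0 the corresponding extraterrestrial radiation, S the measured sunshine duration, S_0 the maximum possible sunshine duration (day length), and a and b site-specific regression coefficients.

```latex
\frac{H}{H_0} = a + b\,\frac{S}{S_0}
```

Temperature-based models of the kind discussed in this paragraph follow the same pattern, with the sunshine fraction replaced by a function of air temperature.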
However, one limitation of the empirical approach is that it may underperform when applied to nonlinear systems. Machine-learning (ML) techniques, whose use in environmental and renewable energy applications has expanded as their potential has been recognized, offer an alternative that overcomes this difficulty. The most widely used ML technique is the artificial neural network (ANN). ANNs are known as universal function approximators, capable of accurately approximating any continuous nonlinear function, and are increasingly employed to tackle complex practical challenges. Their growing use in data analysis highlights their potential as a viable alternative to more conventional approaches in various scientific domains. ANN-based models have been effectively employed to model various solar radiation parameters, particularly in the meteorological and solar energy resources domains [28,29,30]. Jiang [31] computed the monthly average daily global solar radiation using an ANN and compared it with empirical models in China. Similarly, Şenkal and Kuleli [32] evaluated solar radiation in Turkey using both artificial neural networks and satellite data. Their ANN model used the Scaled Conjugate Gradient (SCG) and Resilient Propagation (RP) learning algorithms along with the logistic sigmoid transfer function. The results demonstrated good agreement between the values estimated by the ANN and those derived from satellite data, as indicated by the correlation values for the twelve locations considered in the study. Rahimikhoob [33] likewise predicted global solar radiation from temperature data using an ANN in a semi-arid environment in Iran. The study included a comparison with an empirical model, namely the Hargreaves and Samani model [34]. The results demonstrated that the developed ANN model outperformed the empirical model in modeling daily global solar radiation.
To predict daily global solar radiation (GSR) at three locations in southwest Algeria (Bechar, Tindouf, and Naâma), Benatiallah et al. [35] proposed an artificial neural network (ANN) model. The results demonstrated that, over a five-year period, the Cascade-forward Neural Network (CFNN) and Feed-forward Neural Network (FFNN) models provided significantly improved predictions of daily GSR at the selected sites. Kaushika et al. [36] presented a direct method-based ANN model that considered the relationship features of diffuse, direct, and global solar radiation. Their findings revealed that the ANN model was highly consistent with the data, yielding an overall mean bias error (MBE) of −0.194% and a root mean square error (RMSE) of 5.19% for GSR estimation. To identify the most influential input variables for solar radiation prediction in ANN-based models, Yadav et al. [37] conducted research using WEKA software (Waikato Environment for Knowledge Analysis) on 26 Indian locations with varying climates. The findings indicated that temperature (ambient, minimum, and maximum), sunshine duration, and altitude were the most important input variables for solar radiation prediction, while longitude and latitude had the least influence. Yadav and Chandel [38] evaluated existing ANN-based methods for estimating solar radiation and emphasized the need for further research. Their study revealed that ANN algorithms outperformed conventional techniques in predicting solar radiation. In a review paper, Choudhary et al. [39] examined the development of ANN-based models for solar radiation forecasting and concluded that ANN-based models exhibited significantly higher precision than other approaches.
Lately, with the availability of vast amounts of data gathered from across the world and advances in computing capability, researchers in this field have been increasingly drawn to deep-learning (DL) approaches for constructing prediction models. DL methods belong to the larger family of ML techniques that rely on ANNs with representation learning, and they have become significant in a variety of fields. For example, the majority of research on the prediction of solar irradiation relies on offline models, such as ML and deep neural networks (DNNs) [40]. DNNs include several algorithms, such as long short-term memory (LSTM), gated recurrent unit (GRU), one-dimensional convolutional neural networks (CNN1D), and hybrid configurations such as CNN1D-LSTM, which are considered powerful tools for time series forecasting [41]. In general, models based on ML and DL were created to address complex problems by extracting useful information from large amounts of data. The performance of the DL and ML approaches in hourly GSR prediction was compared in [42]. The results show that, while DL techniques provided better GSR prediction than ML techniques, the performance difference was not considerable. Moreover, the shorter training and testing times of the ML methods (excluding support vector regression) make them preferable, particularly when computational power is a consideration. In other words, ML-based models use less data and take less time to compute than DL-based models. The study also revealed that the ANN is one of the suitable algorithms for accurate GSR prediction.
ANNs are, in general, the most extensively used ML approach. In solar radiation estimation and forecasting, the majority of studies have focused on artificial neural network (ANN) techniques and empirical models [43]. This can be attributed to the widespread adoption of the ANN as the leading artificial intelligence (AI) technique in solar radiation prediction. Notably, the ANN is often used as a benchmark against which the performance of other AI models is evaluated [43,44,45]. Regarding architecture, single-hidden-layer feed-forward ANNs have been widely acknowledged as versatile approximators capable of representing continuous functions effectively [43,46]. Feed-forward neural network models have garnered significant attention due to their simplicity of implementation and ability to provide accurate representations of measurable functions, particularly in weather prediction [47,48]. Consequently, this study focuses specifically on the ANN architecture with a single hidden layer. For instance, Khosravi et al. [45] developed an ANN model (5, 150, 1) to estimate hourly global solar radiation (GSR). This model comprised an input layer with five parameters, a hidden layer with 150 neurons, and an output layer representing GSR. Their results demonstrated that the developed Multilayer Feed-Forward Neural Network (MLFFNN) yielded optimal performance in GSR estimation. Similarly, Yildirim et al. [49] investigated an ANN model (9, 15, 1) for estimating daily GSR in the Turkish Eastern Mediterranean Region. In another study [50], two ANN-based Multi-Layer Perceptron (MLP) models were established to estimate hourly Direct Normal Irradiation (DNI) and daily GSR, with architectures of (4, 5, 1) and (7, 10, 1), respectively. The results exhibited a strong correlation between the estimated and recorded values. Moreover, a study [51] proposed an ANN model (7, 10, 1) for forecasting daily GSR on a horizontal surface at major sites across Zimbabwe. The findings revealed that the developed model provided accurate estimations, supported by robust statistical indicators. In Morocco, Ihya et al. [52] introduced two MLP models, (2, 10, 1) and (3, 10, 1), to estimate hourly and daily diffuse solar fractions in Fez, respectively. Similarly, Elminir et al. [53] developed two ANN models, (3, 40, 1) and (5, 40, 1), to estimate diffuse fractions at daily and hourly scales in Egypt. Furthermore, Bosch et al. [54] proposed an ANN model (3, 10, 1) for forecasting daily solar radiation across mountainous areas in Spain. The results indicated that an ANN employing data from a single radiometric station was an effective and straightforward approach for estimating solar radiation levels in challenging mountainous terrain. Additionally, Ozan Senkal and Tuncay Kuleli [32] introduced an ANN model (6, 6, 1) to predict solar radiation in Turkey.
In general, optimizing the architecture of an ANN is crucial for achieving efficient performance. It is well known that both under-fitting (low complexity and too few neurons) and over-fitting (too many neurons) can result in poor efficiency. Therefore, careful consideration must be given to selecting the number of neurons in the hidden layer. Currently, there is no mathematically justified method for determining the ideal number of hidden neurons. A large number of neurons can increase the network’s training time and compromise its generalization and prediction capabilities. Conversely, a small number of neurons may fail to capture the underlying relationships in the data, leading to inadequate modeling. Consequently, the trial-and-error method is commonly employed to determine the optimal number of neurons in the hidden layer [31,55].
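To make the (inputs, hidden neurons, outputs) notation concrete, the short sketch below counts the trainable parameters implied by some of the architectures quoted above. It assumes a fully connected network with one bias per hidden and output neuron; the function name `count_parameters` is illustrative and does not come from the cited studies.

```python
def count_parameters(n_inputs, n_hidden, n_outputs=1):
    """Trainable parameters of a fully connected single-hidden-layer network.

    Assumption: one bias per hidden neuron and one per output neuron.
    """
    hidden_params = n_inputs * n_hidden + n_hidden    # weights and biases into the hidden layer
    output_params = n_hidden * n_outputs + n_outputs  # weights and biases into the output layer
    return hidden_params + output_params


# Architectures (inputs, hidden neurons, outputs) quoted in the text above.
architectures = [(5, 150, 1), (9, 15, 1), (7, 10, 1), (3, 10, 1), (6, 6, 1)]
for arch in architectures:
    print(arch, count_parameters(*arch))
```

The count grows linearly with the number of hidden neurons, which is one way to see why a large hidden layer lengthens training and can over-fit when the training set is small.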
Moreover, the literature highlights the lack of attention given to determining the appropriate number of neurons within the hidden layer of an ANN, which is considered a limitation [56,57]. Thus, it is essential to identify the optimal ANN design for predicting global solar radiation (GSR). While ANNs have been widely employed to estimate GSR, little emphasis has been placed on the design of the ANN model itself, specifically on determining the number of hidden-layer neurons that yields an optimal architecture. Given their highly nonlinear nature and ability to capture complex relationships within data without relying on predefined assumptions, artificial neural networks have proven to be effective tools for simulating solar radiation. Therefore, this article aims to optimize the design of the ANN, one of the most widely adopted and fastest machine-learning (ML) algorithms, to enhance the accuracy of GSR forecasting while conserving computational resources. To that end, the study focuses on optimizing the number of neurons in the hidden layer to obtain the most effective ANN architecture for precise GSR prediction. Additionally, a solar radiation model was developed for a specific study site located at Lat. 30°51′ N and Long. 29°34′ E, which lacks an AI-based model despite the presence of several proposed solar energy projects in the area, such as the ‘Solar-Greenhouse Desalination System Self-productive of Energy and Irrigating Water Demand’ project and the ‘Multipurpose Applications by Thermodynamic Solar (MATS)’ project. Additional figures and information can be found in Appendix A. Evaluating solar radiation predictions is an essential initial step in assessing the feasibility and performance of such solar energy application projects.
Thus, the following points summarize the novel aspects and contributions of the presented work:
The development of an accurate global solar radiation (GSR) model for the study location, which currently lacks an AI-based model, despite the presence of several planned solar energy projects in the area;
The optimization of the architecture for one of the fastest and most widely used machine-learning algorithms, artificial neural network (ANN), to enhance the precision of solar radiation prediction while conserving computational resources;
The tuning of the number of hidden-layer neurons to establish the most effective ANN model for accurate GSR estimation, addressing the existing research gap in this area;
The investigation of the impact of varying the number of neurons in the hidden layer on the proficiency of the ANN-based model in achieving high-accuracy GSR prediction;
The assessment of the performance of the recently introduced Hassan et al. model [19], one of the best temperature-based models for GSR estimation, over a prolonged period of years, and its comparison with an ANN, which has not previously been reported;
A comprehensive comparative study between the ANN method and the empirical method for global solar radiation, providing valuable insights for designers, engineers, and stakeholders involved in feasible solar energy applications at the study site.
This detailed study presents significant information that is relevant to designers, engineers, and stakeholders interested in solar energy applications at the study site.
The performance of the established models, including the artificial neural network (ANN) models and the empirical model, was evaluated by comparing their predictions with the observed global solar radiation data at the study location, New Borg El-Arab, Egypt (Latitude 30°51′ N and Longitude 29°34′ E). The assessment included the computation of commonly used statistical indicators: Mean Percentage Error (MPE), Mean Bias Error (MBE), Mean Absolute Percentage Error (MAPE), Mean Absolute Bias Error (MABE), Root Mean Square Error (RMSE), Relative Error (e), Coefficient of Determination (R²), and Correlation Coefficient (r) [9,16,58,59,60,61]. These indicators were used to evaluate the models and determine the most suitable one for the task.
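The defining equations of these indicators are not reproduced in this part of the manuscript, so the following is only a minimal sketch based on their commonly used definitions. The sign convention (predicted minus measured), the form of R² as one minus the ratio of residual to total sum of squares, and the function names `indicators` and `relative_errors` are assumptions made for illustration.

```python
import math


def indicators(measured, predicted):
    """Common error statistics for paired measured and predicted values."""
    n = len(measured)
    errors = [p - m for m, p in zip(measured, predicted)]  # assumed sign: predicted - measured
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return {
        "MBE": sum(errors) / n,
        "MABE": sum(abs(e) for e in errors) / n,
        "RMSE": math.sqrt(sum(e * e for e in errors) / n),
        "MPE": 100.0 * sum(e / m for e, m in zip(errors, measured)) / n,
        "MAPE": 100.0 * sum(abs(e / m) for e, m in zip(errors, measured)) / n,
        "r": cov / math.sqrt(var_m * var_p),               # Pearson correlation coefficient
        "R2": 1.0 - sum(e * e for e in errors) / var_m,    # coefficient of determination
    }


def relative_errors(measured, predicted):
    """Relative error e (in percent) for each pair, e.g. one value per month."""
    return [100.0 * (p - m) / m for m, p in zip(measured, predicted)]


# Made-up numbers, only to show the call; they are not data from the study.
stats = indicators([10.0, 20.0, 30.0], [11.0, 19.0, 33.0])
monthly_e = relative_errors([10.0, 20.0, 30.0], [11.0, 19.0, 33.0])
```

Under these definitions R² is not simply the square of r, which is consistent with the two being reported as separate indicators.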
The remainder of the manuscript is organized as follows: Section 2.1 describes the dataset used, including the global solar radiation data and other relevant parameters, as well as the methodology employed for calculating extraterrestrial solar radiation. Section 2.2 and Section 2.3 explain AI for solar radiation prediction and the functioning of ANNs, respectively. Section 2.4 and Section 2.5 detail the development of the ANN-based models and the empirical model, respectively. Section 3 discusses the key indicators commonly used to evaluate the performance of the models. The results and their discussion, including a comparative analysis of the performance of both techniques, are presented in Section 4. Finally, Section 5 concludes the manuscript and outlines potential avenues for future research.
4. Results and Discussion
This section presents the findings obtained from optimizing the developed ANN models. The number of neurons in the hidden layer was varied from one to fifty to determine the optimal ANN architecture. The measured daily ambient temperature and global solar radiation data were divided into two sets and averaged to obtain monthly average daily values. The first set, spanning from 1 January 1984 to 31 December 2017, was used to construct both models. For the empirical model, regression analysis was employed to derive the empirical coefficients corresponding to the actual data of the study location, as outlined in Table 1 [16,25,97]. The ANN models, in contrast, were trained using the MATLAB neural network toolbox, with all training data normalized to the range 0 to 1 prior to the training phase.
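The exact scaling used in the toolbox is not stated here; a minimal sketch, assuming ordinary min-max scaling to the range 0 to 1, is given below. The helper names and the sample temperatures are hypothetical. The minimum and maximum are taken from the training set and reused for any later data, so that predictions can be mapped back to physical units.

```python
def fit_min_max(values):
    """Return the (min, max) pair used for scaling; taken from the training data."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot scale a constant series")
    return lo, hi


def scale(values, lo, hi):
    """Map values linearly so that lo -> 0 and hi -> 1."""
    return [(v - lo) / (hi - lo) for v in values]


def unscale(scaled, lo, hi):
    """Inverse of scale(): map values from the 0 to 1 range back to original units."""
    return [s * (hi - lo) + lo for s in scaled]


# Hypothetical monthly mean temperatures in degrees Celsius (not study data).
train_temps = [14.0, 18.0, 22.0, 26.0, 30.0]
lo, hi = fit_min_max(train_temps)
scaled = scale(train_temps, lo, hi)
restored = unscale(scaled, lo, hi)
```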
As mentioned earlier, a main objective of this study was to optimize the design of the ANN, one of the most widely used and efficient machine-learning techniques, for accurate GSR forecasting and computational efficiency. Determining the appropriate number of neurons in the single hidden layer is therefore crucial to developing a robust architecture for precise GSR prediction. Consequently, hidden layers with one to fifty neurons were investigated. Each ANN model was trained ten times, and the best result of the ten runs was selected. The optimal performance of each trained ANN model is summarized in Table 2.
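The search described above was carried out by the authors in the MATLAB neural network toolbox. Purely as an illustration of the procedure, the pure-Python sketch below sweeps the hidden-layer size and keeps the best of several random restarts for each size. It is not the authors' implementation: the data are synthetic, the ranges are reduced (1 to 4 neurons, 3 restarts), training is plain batch gradient descent with a tanh hidden layer, selection by validation RMSE is an assumption, and all function names are hypothetical.

```python
import math
import random


def train_mlp(xs, ys, n_hidden, seed, epochs=500, lr=0.1):
    """Fit a 1-input, single-hidden-layer tanh network by batch gradient descent."""
    rng = random.Random(seed)
    w1 = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]  # input-to-hidden weights
    b1 = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]  # hidden biases
    w2 = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]  # hidden-to-output weights
    b2 = 0.0                                                 # output bias
    n = len(xs)
    for _ in range(epochs):
        g_w1 = [0.0] * n_hidden
        g_b1 = [0.0] * n_hidden
        g_w2 = [0.0] * n_hidden
        g_b2 = 0.0
        for x, y in zip(xs, ys):
            h = [math.tanh(w1[j] * x + b1[j]) for j in range(n_hidden)]
            err = sum(w2[j] * h[j] for j in range(n_hidden)) + b2 - y
            g_b2 += err
            for j in range(n_hidden):
                g_w2[j] += err * h[j]
                back = err * w2[j] * (1.0 - h[j] ** 2)  # gradient through tanh
                g_w1[j] += back * x
                g_b1[j] += back
        for j in range(n_hidden):
            w1[j] -= lr * g_w1[j] / n
            b1[j] -= lr * g_b1[j] / n
            w2[j] -= lr * g_w2[j] / n
        b2 -= lr * g_b2 / n
    return w1, b1, w2, b2


def predict(model, x):
    w1, b1, w2, b2 = model
    return sum(w * math.tanh(a * x + b) for a, b, w in zip(w1, b1, w2)) + b2


def rmse(model, xs, ys):
    return math.sqrt(sum((predict(model, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


def sweep(train, valid, sizes, runs):
    """Map each hidden-layer size to the lowest validation RMSE over `runs` restarts."""
    return {
        n_hidden: min(
            rmse(train_mlp(train[0], train[1], n_hidden, seed), valid[0], valid[1])
            for seed in range(runs)
        )
        for n_hidden in sizes
    }


# Synthetic, already-normalized data standing in for (temperature, GSR) pairs.
train_x = [i / 11 for i in range(12)]
train = (train_x, [math.sqrt(x) for x in train_x])
valid_x = [(i + 0.5) / 11 for i in range(11)]
valid = (valid_x, [math.sqrt(x) for x in valid_x])

results = sweep(train, valid, sizes=range(1, 5), runs=3)
best_size = min(results, key=results.get)
print(best_size, round(results[best_size], 4))
```

Keeping the best of several restarts matters because each run starts from different random weights, so a single run can understate what a given hidden-layer size is capable of.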
The second set, covering the period from 1 January 2020 to 31 December 2020, was used to evaluate and validate the developed models (empirical and ANN) using various statistical indicators. The predicted values of global solar radiation from both models were compared against the measured data at the selected site. The evaluation indicators (MPE, MAPE, RMSE, MBE, MABE, r, and R²) were obtained using Equations (12)–(19), and the best models are indicated in bold. The following subsections discuss in detail the results obtained from both techniques (ANN method and empirical method), as well as their performance comparison.
4.1. Impact of Varying the Number of Neurons on ANN Prediction Accuracy
For the developed ANN-based solar models, the performance indicators MBE, MPE, t-test, RMSE, MAPE, MABE, r, and R² were calculated for each model using Equations (12)–(19) and are summarized in Table 3. The statistical errors of all developed ANN models were within the acceptable range of ±10%, except for the model with 46 neurons in its hidden layer, “Model_NurnNo._46”, whose MAPE of 14.9119% exceeded that range. While Model_NurnNo._46 had the worst performance, Model_NurnNo._3 provided the best, followed by Model_NurnNo._1, Model_NurnNo._2, and Model_NurnNo._4, each with a coefficient of determination, R², higher than 0.98. As mentioned before, a high R² indicates a good fit between the model’s predictions and the observed values of global solar radiation. The best ANN model, Model_NurnNo._3, had good values for all indicators: t-test, MPE, MBE, RMSE, MAPE, MABE, r, and R² were 0.7968, 2.4354%, 0.1953 MJ/m² day⁻¹, 0.8361 MJ/m² day⁻¹, 4.3401%, 0.7216 MJ/m² day⁻¹, 0.9991, and 0.9838, respectively. Figure 6a shows its predictions compared with the measured data throughout the year. The predictions for the winter months were slightly overestimated, which may be due to weather conditions such as clouds, rain, and wind [19,72,98].
It is also worth noting that, first, ANN architectures with few neurons in the hidden layer (1 to 4) gave the best performance of all models, with R² > 0.98. Second, the performance of the developed ANN models was approximately stable and excellent when the hidden layer had fewer than 10 neurons, with closely grouped values of R² > 0.97, as seen in Table 3. Furthermore, while some ANN architectures with many neurons in the hidden layer performed well, such as “Model_NurnNo._32” with R² = 0.97986, many other ANN models varied in their performance, with R² ranging from 0.48 to 0.97, and performance instability was observed. The performance of the developed ANN models against the number of neurons in the single hidden layer is shown in Figure 7. The models are also ranked by performance in Table 4, where the nine top-ranked models had fewer than ten neurons in the hidden layer, with the exception of Models 11 and 32.
Moreover, the relative error, e, was calculated for each month for all developed ANN models using Equation (20) and is presented in Table 5. The values for the best ANN model, Model_NurnNo._3, fell within the preferred range of ±10% for all months except November and December, where they slightly exceeded it at 11.9% and 10.8%, respectively. Similarly, the values of the second- and third-ranked models, Model_NurnNo._1 and Model_NurnNo._2, were within the range except in some winter months, where they marginally exceeded it. This can be attributed to varying climatic conditions, particularly in winter, such as rain, clouds, and wind [19,96,98].
In general, while some models, such as Model_NurnNo._32, exhibited relative error values within the range for all months, others slightly exceeded the range during certain winter months. Certain models exceeded the range significantly, such as Model_NurnNo._14 in January, with a relative error of 30%. Furthermore, although many ANN models showed good statistical indicators (MBE, MPE, t-test, RMSE, MAPE, MABE, r, and R²), their relative error noticeably exceeded the range, particularly in winter months. For instance, in January, the relative error for Models 13, 14, 15, 21, 25, and 35 exceeded 20%. Finally, the ANN models with the most favorable relative error values were Model_NurnNo._3, Model_NurnNo._32, and Model_NurnNo._1.
4.2. Performance Comparison with Conventional Methods
The performance of the best ANN model (Model_NurnNo._3) was then compared with that of the conventional (empirical) model. As previously mentioned, Hassan et al. [19] recently proposed new temperature-based solar models for estimating global solar radiation, which have not previously been compared with any ML model. The empirical coefficients for the best Hassan et al. model (Equation (7)) were computed and are presented in Table 1, and its statistical errors were obtained and compared with those of the best ANN model, as shown in Table 6. The results revealed that Hassan et al.’s model performed excellently, with t-test, MPE, MBE, RMSE, MAPE, MABE, r, and R² values of 2.4991, −3.1889%, −0.4620 MJ/m² day⁻¹, 0.7676 MJ/m² day⁻¹, 3.6733%, 0.6021 MJ/m² day⁻¹, 0.9981, and 0.9864, respectively. Furthermore, while both models (ANN and empirical) exhibited similar performance, Hassan et al.’s model outperformed the ANN model, with the higher R² value of 0.9864 [25]. The predictions of each model, together with the measured data, are presented in Figure 6a–c, and their statistical indicators are depicted in Figure 8.
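As a quick consistency check on the reported indicators, the t-statistic can be recomputed from the MBE and RMSE values given above. The sketch assumes the commonly used form t = sqrt((n − 1) MBE² / (RMSE² − MBE²)) and n = 12 monthly values in the validation year; both are assumptions, since the defining equation is not reproduced in this part of the manuscript.

```python
import math


def t_statistic(mbe, rmse, n):
    """t-statistic from MBE and RMSE (commonly used form; an assumption here)."""
    return math.sqrt((n - 1) * mbe ** 2 / (rmse ** 2 - mbe ** 2))


n_months = 12  # assumption: twelve monthly average daily values in the validation year
t_ann = t_statistic(0.1953, 0.8361, n_months)      # best ANN model, reported t-test 0.7968
t_hassan = t_statistic(-0.4620, 0.7676, n_months)  # Hassan et al. model, reported t-test 2.4991
print(round(t_ann, 4), round(t_hassan, 4))
```

Under these assumptions both recomputed values agree with the reported ones to within about 0.001. The larger t-value of the empirical model reflects its larger bias relative to its scatter, even though its RMSE is lower.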
Additionally, the relative errors for the empirical model were calculated using Equation (20) and compared with those of the best ANN model, as shown in Table 7. The relative errors of the empirical model of Hassan et al. [19] fell within the acceptable range of ±10% for all months of the year, including the winter months. Conversely, the values of the best ANN model, Model_NurnNo._3, slightly exceeded the range in November and December, at 11.9% and 10.8%, respectively. This can be attributed to weather factors, especially during the winter season, such as clouds, wind, and rain [19,72,97,98]. Figure 9 illustrates the relative errors for both models throughout the year at the study location, New Borg El-Arab City, Alexandria, Egypt.
4.3. Influence of the Learning Rate on ANN Prediction Accuracy
The influence of the learning rate on ANN prediction accuracy was also investigated. The best-developed ANN architecture, with three neurons in its hidden layer (Model_NurnNo._3), was used to assess the impact of varying the learning rate on performance. Five further learning rates, 0.5, 0.1, 0.05, 0.005, and 0.001, were examined and compared with the most commonly used one, 0.01; all performance indicators are summarized in Table 8. The results show that the accuracy obtained for all tested learning rates was very good, with R² higher than 97%. Additionally, while the ANN models with learning rates of 0.05, 0.01, 0.005, and 0.001 performed very similarly, the model with a learning rate of 0.01 (the most commonly used one) provided the best accuracy, followed by the models with learning rates of 0.001, 0.05, and 0.005, respectively. The accuracy obtained with the best learning rate, 0.01, is illustrated in Figure 10. In general, varying the learning rate had no significant effect on the accuracy and predictions of the best-developed ANN architecture; the variation in model performance was very small.
4.4. Comparison with Previous Related Work
Furthermore, the results of this study were compared with related work in the state of the art, both in terms of long-term solar radiation prediction and in terms of machine-learning and deep-learning techniques. As mentioned before, DL techniques are rarely used for long-term GSR forecasting; they are employed almost exclusively for short-term prediction, at minute and hourly scales. However, a recent study published in Energies used DL algorithms for long-term GSR prediction in four Australian cities [82]. The study employed DL algorithms such as Deep Neural Networks (DNN) and Deep Belief Networks (DBN) to estimate long-term (monthly scale) GSR. Different architectures of both algorithms were investigated, and the best DL-based models were compared with the most common ML-based methods, such as ANN (single hidden layer), Decision Tree (DT), and Random Forest (RF). The two best architectures of the DL-based models and the two best ML-based models were selected and are presented in Table 9. The results showed that the two DL-based models (DNN and DBN) and the ANN (ML-based model) outperformed all other data-driven models in terms of accuracy.
Notably, the performance of the two DL models, DBN and DNN, was very similar. However, the DBN model had the best performance at all sites, with RMSE values between 0.503 and 0.773 MJ/m² day⁻¹ and correlation coefficient, r, values between 0.974 and 0.997. Moreover, the ANN (single hidden layer) model provided the best performance of all examined ML-based models at the four selected locations, with RMSE and r values ranging from 0.653 to 1.276 MJ/m² day⁻¹ and from 0.972 to 0.997, respectively. This indicates that the performance of the ANN and the DL-based models was very close, as demonstrated in Table 9. The results of our optimized ANN model were then compared with the best DL-based and ML-based models from the previous study [82]. Table 10 shows the RMSE and r values for each model. Our optimized ANN model had a very good RMSE value of 0.8361 MJ/m² day⁻¹, which falls within the ranges of both the DL and ML techniques. Compared with the previous study, our optimized ANN model had a better r value of 0.999. The RMSE and r values of the optimized ANN model in this study are compared with those of the previous work in Figure 11 and Figure 12. The optimized ANN model in this study thus demonstrated strong performance, coming very close to the top-performing models based on deep-learning techniques.
Overall, while the difference in efficiency between DL and ML techniques is insignificant, especially when predicting monthly average daily GSR, the shorter time needed to train and test ML techniques makes them advantageous, particularly when computational power is considered. These results therefore support the main objective of this work, which is to improve the accuracy of solar radiation forecasts while preserving computing resources by optimizing the design of ANNs, one of the quickest and most popular machine-learning algorithms. Additionally, the results of the current work are in line with recent related work [42,82].
Based on the obtained results, it can be concluded that the models developed in this study, namely the best ANN model and the empirical model [19], exhibited high accuracy in estimating global solar radiation. These temperature-based models are widely applicable and can be effectively integrated with various short-term or long-term weather forecasting methods. Furthermore, ANN-based solar radiation models with a limited number of neurons in the single hidden layer (fewer than 10) show great promise for accurately predicting global solar radiation. Additionally, the Hassan et al. model [19], represented by Equation (7), proved to be a reliable empirical tool for accurately predicting global solar radiation.
5. Conclusions and Future Work
This study aimed to optimize the design of the artificial neural network (ANN), one of the most widely used machine-learning algorithms, for accurate global solar radiation (GSR) forecasting while minimizing computational requirements. The focus was on optimizing the number of neurons in the hidden layer, an aspect that has received limited attention in the literature, to develop the most effective ANN architecture for precise GSR estimation. Additionally, the study proposed accurate solar radiation models specifically tailored for the study site, which currently lacks ML-based models and where several solar energy projects are planned. Furthermore, the study investigated the impact of varying the number of neurons in the hidden layer on the performance of the ANN-based solar radiation model. It also assessed the performance of the Hassan et al. model [19], a leading temperature-based empirical model that had not previously been compared with ML-based models such as ANNs. Finally, the study conducted a comparative analysis between the ANN method and the empirical method for estimating global solar radiation on a horizontal plane. To achieve these objectives, global solar radiation data measured over a period of 35 years at the study location, New Borg El-Arab, were used for model development and validation.
The results demonstrate that the models developed in this study, specifically the best ANN model (Model_NurnNo._3) and the empirical model (Hassan et al. model), provide an excellent estimation of global solar radiation, with a coefficient of determination (R²) exceeding 0.98. Moreover, their other statistical indicators fell within acceptable ranges. Notably, ANN architectures with a small number of neurons in the single hidden layer, ranging from 1 to 4 neurons, exhibited the best performance among the ANN models, with R² values exceeding 0.98. The performance of the developed ANN models remained stable and excellent when the number of neurons in the hidden layer was less than ten, with R² values exceeding 0.97. However, performance instability was observed when the number of neurons in the hidden layer exceeded nine.
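The neuron-count comparison summarized above can be illustrated with a minimal sketch. It is not the configuration used in this study: the single input feature, the tanh activation, plain batch gradient descent, the synthetic seasonal data, and the names `train_ann`, `r_squared`, `sweep_hidden_neurons`, and `make_set` are all assumptions introduced for illustration. Only the idea of varying the number of neurons in one hidden layer and the learning rate of 0.01 are taken from the text.

```python
import math
import random


def train_ann(xs, ys, n_hidden, lr=0.01, epochs=2000, seed=0):
    """Fit a 1-input network with one tanh hidden layer by batch gradient descent."""
    rng = random.Random(seed)
    w1 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]  # input-to-hidden weights
    b1 = [0.0] * n_hidden                                   # hidden biases
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]  # hidden-to-output weights
    b2 = 0.0                                                # output bias
    n = len(xs)
    for _ in range(epochs):
        gw1 = [0.0] * n_hidden
        gb1 = [0.0] * n_hidden
        gw2 = [0.0] * n_hidden
        gb2 = 0.0
        for x, y in zip(xs, ys):
            h = [math.tanh(w1[j] * x + b1[j]) for j in range(n_hidden)]
            err = sum(w2[j] * h[j] for j in range(n_hidden)) + b2 - y
            gb2 += err
            for j in range(n_hidden):
                back = err * w2[j] * (1.0 - h[j] ** 2)  # error propagated to neuron j
                gw2[j] += err * h[j]
                gw1[j] += back * x
                gb1[j] += back
        # Average the accumulated gradients and take one descent step.
        for j in range(n_hidden):
            w1[j] -= lr * gw1[j] / n
            b1[j] -= lr * gb1[j] / n
            w2[j] -= lr * gw2[j] / n
        b2 -= lr * gb2 / n

    def predict(x):
        return sum(w2[j] * math.tanh(w1[j] * x + b1[j]) for j in range(n_hidden)) + b2

    return predict


def r_squared(measured, estimated):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_m = sum(measured) / len(measured)
    ss_res = sum((m - e) ** 2 for m, e in zip(measured, estimated))
    ss_tot = sum((m - mean_m) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


def sweep_hidden_neurons(train, valid, max_neurons):
    """Train one network per hidden-layer size and score each on the validation set."""
    scores = {}
    for k in range(1, max_neurons + 1):
        predict = train_ann(train[0], train[1], k)
        scores[k] = r_squared(valid[1], [predict(x) for x in valid[0]])
    return scores


# Synthetic placeholder data (NOT the study's measurements): a noisy seasonal curve.
_rng = random.Random(1)


def make_set(n):
    xs = [_rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [math.cos(math.pi * x) + _rng.gauss(0.0, 0.05) for x in xs]
    return xs, ys


scores = sweep_hidden_neurons(make_set(36), make_set(12), max_neurons=4)
for k, score in scores.items():
    print(f"{k} hidden neuron(s): validation R^2 = {score:.3f}")
```

The printed scores depend entirely on the synthetic data and the training settings chosen here, so they say nothing about the results reported in this study. The sketch only shows the structure of the comparison: one model per hidden-layer size, each scored on data held out from training.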
Furthermore, the comparison between the best ANN-based model (Model_NurnNo._3) and one of the best empirical models, the Hassan et al. model [19], revealed that both models performed excellently, with R² values exceeding 0.98. While the performance of both models was quite similar, the Hassan et al. model outperformed the best ANN model, exhibiting the highest R² value. The performance indicators of the Hassan et al. model included the t-test, MPE, MBE, RMSE, MAPE, MABE, r, and R², with values of 2.4991, −3.1889%, −0.4620 MJ/m² day⁻¹, 0.7676 MJ/m² day⁻¹, 3.6733%, 0.6021 MJ/m² day⁻¹, 0.9981, and 0.9864, respectively. Additionally, while the relative error for the best ANN model slightly exceeded the acceptable range of ±10% in November and December, the relative error for the empirical model (the Hassan et al. model) remained within the range, even during winter months.
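For readers who wish to reproduce indicators of this kind, the following sketch computes them from paired measured and estimated values. It assumes the definitions conventional in the solar radiation literature (errors taken as estimated minus measured, and the t-statistic computed from MBE and RMSE); the equations used in this study may differ in detail. The function names `indicators`, `t_statistic`, and `outside_band` are hypothetical.

```python
import math


def t_statistic(mbe, rmse, n):
    """t-statistic from MBE and RMSE; undefined when RMSE^2 equals MBE^2."""
    return math.sqrt((n - 1) * mbe ** 2 / (rmse ** 2 - mbe ** 2))


def indicators(measured, estimated):
    """Return common error statistics for paired measured/estimated values."""
    n = len(measured)
    diffs = [e - m for m, e in zip(measured, estimated)]  # assumed sign convention
    rel = [100.0 * d / m for d, m in zip(diffs, measured)]  # relative error, %
    mbe = sum(diffs) / n
    rmse = math.sqrt(sum(d * d for d in diffs) / n)
    mean_m = sum(measured) / n
    mean_e = sum(estimated) / n
    cov = sum((m - mean_m) * (e - mean_e) for m, e in zip(measured, estimated))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    return {
        "MBE": mbe,
        "MABE": sum(abs(d) for d in diffs) / n,
        "RMSE": rmse,
        "MPE": sum(rel) / n,
        "MAPE": sum(abs(p) for p in rel) / n,
        "r": cov / math.sqrt(var_m * var_e),
        "R2": 1.0 - sum(d * d for d in diffs) / var_m,
        "t": t_statistic(mbe, rmse, n),
        "relative_error_pct": rel,
    }


def outside_band(relative_errors, limit=10.0):
    """Indices of points whose relative error falls outside +/- limit percent."""
    return [i for i, p in enumerate(relative_errors) if abs(p) > limit]


# Consistency check using the MBE and RMSE reported above for the Hassan et al. model,
# assuming n = 12 monthly values (an assumption, not stated in this section).
t_check = t_statistic(-0.4620, 0.7676, 12)
print(f"t-statistic from reported MBE and RMSE: {t_check:.4f}")
```

Under the assumption of 12 monthly values, the t-statistic recomputed from the rounded MBE and RMSE comes out close to the reported 2.4991, which suggests that the conventional definition is the one in use. The `outside_band` helper mirrors the ±10% relative-error criterion applied to the monthly estimates.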
Additionally, the results of the optimized ANN model in this work were compared with recent related work, both at the level of solar radiation prediction (long-term forecast) and at the level of machine-learning (ML) and deep-learning (DL) techniques. Its RMSE of 0.8361 MJ/m² day⁻¹ falls within the ranges of both DL and ML techniques, while its correlation coefficient (r) of 0.999 was the best among the compared models. This demonstrates its ability to improve the accuracy of solar radiation forecasts while preserving computing resources, since the efficiency difference between DL and ML techniques was insignificant, especially when predicting monthly average daily GSR. Additionally, the influence of the learning rate on its accuracy was examined, and the best value was found to be 0.01. Consequently, the models presented in this study, the best ANN model and the empirical model, demonstrated high accuracy in forecasting global solar radiation, making them suitable for various research projects at the study site. Moreover, the temperature-based solar radiation models developed in this study can be effectively combined with different long-term or short-term weather forecasting methods, enhancing their applicability.
In future studies, it is recommended to explore additional ANN architectures and training algorithms to evaluate their impact on prediction accuracy. Additionally, the performance of other models (AI-based and empirical-based models) can be investigated and compared with the proposed ones in this study, using different locations, particularly coastal areas.