Article

Hybrid Machine Learning for Solar Radiation Prediction in Reduced Feature Spaces

by
Abdel-Rahman Hedar
1,2,
Majid Almaraashi
3,
Alaa E. Abdel-Hakim
1,4,* and
Mahmoud Abdulrahim
5
1
Department of Computer Science in Jamoum, Umm Al-Qura University, Makkah 25371, Saudi Arabia
2
Department of Computer Science, Assiut University, Assiut 71526, Egypt
3
Department of Computer Sciences, College of Computer Sciences and Engineering, University of Jeddah, Jeddah 23218, Saudi Arabia
4
Electrical Engineering Department, Assiut University, Assiut 71516, Egypt
5
Department of Meteorology, Faculty of Meteorology, Environment and Arid Land Agriculture, King Abdulaziz University, Jeddah 22254, Saudi Arabia
*
Author to whom correspondence should be addressed.
Energies 2021, 14(23), 7970; https://doi.org/10.3390/en14237970
Submission received: 30 October 2021 / Revised: 18 November 2021 / Accepted: 19 November 2021 / Published: 29 November 2021
(This article belongs to the Section A2: Solar Energy and Photovoltaic Systems)

Abstract
Solar radiation prediction is an important process in ensuring optimal exploitation of solar power. Numerous models have been applied to this problem, such as numerical weather prediction models and artificial intelligence models. However, well-designed hybridization approaches that combine numerical models with artificial intelligence models can provide a significant improvement in prediction accuracy. In this paper, novel hybrid machine learning approaches that exploit auxiliary numerical data are proposed. The proposed hybrid methods invoke different machine learning paradigms, including feature selection, classification, and regression. Additionally, numerical weather prediction (NWP) models are used in the proposed hybrid models. Feature selection is used to reduce the dimension of the feature space, since a large number of recorded parameters affects the estimation and prediction processes. Rough set theory is applied for attribute reduction, with the dependency degree used as a fitness function. The effect of the attribute reduction process is investigated using thirty different classification and prediction models in addition to the proposed hybrid model. Then, different machine learning models are constructed based on classification and regression techniques to predict solar radiation. Moreover, other hybrid prediction models are formulated to use the output of the Weather Research and Forecasting (WRF) numerical model as learning elements in order to improve prediction accuracy. The proposed methodologies are evaluated using a data set collected from different regions in Saudi Arabia. Across the different data collection regions, feature reduction improved classification rates by up to 8.5% for the best classifiers and up to 15% for the other classifiers.
In regression, it improved the average root mean square error by up to 5.6% and the mean absolute error by up to 8.3%. On some data sets, the hybrid models reduced the root mean square errors by 70.2% and 4.3% relative to the numerical and machine learning models, respectively. For some reduced-feature data, the hybrid models reduced the root mean square errors by 47.3% and 14.4% relative to the numerical and machine learning models, respectively.

1. Introduction

Solar energy is considered a major source of future renewable energy [1]. As dependence on renewable energy increases, solar energy receives more attention. Solar radiation data are the main ingredient of the optimum design and operation of solar power systems [2]. It is necessary to ensure the stability of the energy supplied by solar stations. Therefore, accurate prediction of the amount of solar radiation at a specific location is critical from an operational perspective. For parties such as governments, enterprises, and energy operators, solar radiation prediction is a key for optimal strategic plans, particularly when hybridized with different energy sources. However, such an objective is associated with practical difficulties. In particular, the potential of solar energy is limited by the inaccuracy of solar radiation prediction compared with certain alternative resources. In order to handle this problem, several prediction models have been proposed in the literature to predict solar radiation, including numerical weather prediction (NWP) and artificial intelligence models, e.g., [3,4,5,6,7,8]. However, the large number of parameters associated with the prediction process, including weather and topography variables, significantly affects the underlying prediction models. Therefore, it is crucial to obtain a good representative set of these parameters, or features as termed in machine learning, to improve predictor performance as well as to reduce the computational cost of real-time prediction systems.
NWP models can provide forecasts of solar radiation several days ahead, along with other weather parameters such as temperature, air pressure, relative humidity, and wind speed [9]. Such information can be useful for optimizing solar plant operating strategies. These models rely on atmospheric reanalysis to obtain initial and boundary conditions for the model run, which is then realistically downscaled to a finer physical resolution using physical equations. An NWP model that downscales reanalysis data is called a mesoscale model. As mesoscale models run over a smaller area than global-scale models, they include additional details. Therefore, these models can provide forecasts of solar irradiance with a high temporal and spatial resolution over a wide area, but at the cost of high computing power. The Weather Research and Forecasting (WRF) model [10] is the most commonly used mesoscale model, and it has been extensively applied and assessed. In this paper, a nonhydrostatic WRF v3.7.1 model has been applied to simulate dust storm events over Saudi Arabia to evaluate the reliability of global horizontal irradiance (GHI) forecasts.
Regarding artificial intelligence (AI) models, a large number of models for predicting solar radiation or solar power have been proposed. For example, AI models have been applied to predict solar radiation using fuzzy logic sets and systems [11,12], neuro-fuzzy systems [13], neural networks [14,15,16], machine/deep learning [17,18,19,20,21,22,23,24], and LSTM [25,26,27,28,29]. There are also regression tools based on statistical models, such as linear and non-linear regression, especially for seasonally repeated patterns, e.g., Prophet [30,31]. Automated Time Series Models in Python (AtsPy) [32] provides a software package to compare the forecasting performance of about ten other regression algorithms along with Prophet.
Abdel-Nasser et al. [33] developed an LSTM-based method for solar irradiance forecasting. They used LSTM models with an aggregation function based on Choquet integral. Combining Choquet integral with LSTM aimed at achieving more accurate predictions due to the memory units and the recurrent architecture which can model the temporal changes in solar irradiance. The interaction between aggregated inputs are modeled by the Choquet integral through a fuzzy measure.
Almaraashi [34] applied fuzzy logic systems designed and optimized using fuzzy c-means clustering (FCM) and simulated annealing (SA) algorithms to forecast global horizontal irradiance (GHI) at eight stations in Saudi Arabia. In addition, Almaraashi predicted daily solar radiation at the same eight stations in Saudi Arabia using multi-layer neural networks (NNs), after applying four feature selection methods to discover the most important variables [35]: the Relief algorithm, the Random-Frog algorithm, the Monte Carlo Uninformative Variable Elimination (MCUVE) algorithm, and the Laplacian Score (LS) algorithm. A hybrid model presented by Voyant et al. [36] combined the NWP model with a hybrid auto-regressive moving average (ARMA) model and neural networks to forecast hourly global radiation for five locations in the Mediterranean area.
Boubaker et al. [37] investigated one-day-ahead prediction of GHI using various DNN models at Hail city, Saudi Arabia. They used six different DNN models: LSTM, BiLSTM, GRU, Bi-GRU, one-dimensional CNN, and other hybrid configurations such as CNN-LSTM and CNN-BiLSTM. The used DNN models depend only on historical daily values of GHI. However, these models did not take into consideration crucial weather parameters that may affect GHI, e.g., air temperature, humidity, wind speed, wind direction, and atmospheric pressure.
The intuitive parameter selection by experts when predicting solar radiation can result in different sets of possible input parameters in which some might appear to be redundant or irrelevant. In addition, the manual selection of the most relevant features for this problem is affected by the large dimensionality of the input feature space. Given such a case, the automatic dimension reduction of the input feature space can be a valuable solution.
Solar energy prediction needs large amounts of data, which require a large number of measuring devices and equipment. Moreover, the calculations of weather data needed for the solar energy prediction process are often computationally expensive. Therefore, one of the most important motivations for this research is to reduce the data reading and calculation processes required for solar energy prediction and to reduce the cost of this process. This helps to expand prediction operations in a broader and more comprehensive way, even beyond the scope of traditional measurement stations. Another major motivation for this paper is to use the power of smart and hybrid systems in predicting short-term solar energy levels. Therefore, in this paper, a modified version of the tabu search attribute reduction (TSAR) [38] is presented as a feature selection method along with different prediction models for the estimation of solar radiation levels. The main modifications of that method are adding more local search and extending some other search operations. Consequently, various classification and regression models are designed to predict solar radiation based on reduced features. Moreover, other hybrid predictive models are formulated in order to utilize the outputs of the WRF numerical model as learning elements to increase prediction accuracy. In addition to the proposed prediction models, the impact of the attribute reduction mechanism on different classification and regression models is investigated.

2. Methodology

The proposed methodology comprises different design elements, including feature selection, classification, regression, and numerical models. In this section, these design elements are discussed and their integration in creating our prediction models is illustrated. The main layout of the proposed methodology is presented in Figure 1. Four possible methods can be implemented using this layout based on:
  • Invoking reduced solar feature data or not;
  • Applying only machine learning prediction models or hybrid models with the numerical WRF models.

2.1. Feature Selection

The proposed feature selection (FS) method is designed on the basis of a modified version of the Tabu Search Attribute Reduction (TSAR) method [38]. The proposed method adds more local search and extends some other search operations. The modified feature selection is denoted by mTSAR. The FS method selects the best features in order to use them in building classifiers and prediction models. The main steps in the work-flow of the presented method are highlighted in Figure 2, and detailed in the following subsections.

2.1.1. Solution Representation

The FS method encodes its solutions in binary vectors. The dimension of these vectors is equal to the number of conditional features. Therefore, if the entity of the coding-vector has a value of 1, it implies that the corresponding feature is included in the solution represented by this vector. Otherwise, this feature is not included in that solution.
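As a minimal illustration of this encoding, a binary coding vector can be decoded into the indices of the selected conditional features as follows (the eight-feature vector is hypothetical):

```python
import numpy as np

# Hypothetical 8-feature coding vector: 1 = feature included in the reduct.
solution = np.array([1, 0, 1, 1, 0, 0, 1, 0])

# Decode the vector into the indices of the selected conditional features.
selected = np.flatnonzero(solution)
print(selected)  # indices 0, 2, 3, 6
```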

2.1.2. Feature Selection Evaluation

The dependency degree concept in the rough set theory [39] is invoked to evaluate the goodness of reducts or solutions. Therefore, the feature selection problem can be defined in terms of maximizing the dependency degree values of the solutions and minimizing their cardinality. The dependency degree function of a solution (feature reduct) can be computed using the following definitions [39]:
  • Indiscernibility Relation. Given the set A of all condition features and a feature subset P ⊆ A, the indiscernibility relation is denoted by IND(P) and defined as:
    IND(P) = {(ξ, η) ∈ U × U : ∀a ∈ P, a(ξ) = a(η)}.
  • Indiscernibility Equivalence. The relation IND(P) forms an equivalence relation on the set U and induces a partition of U denoted by U/IND(P). For any pair (ξ, η) ∈ IND(P), it can be said that ξ and η are indiscernible by the features of P. The P-indiscernibility equivalence class of ξ is denoted by [ξ]_P.
  • Lower and Upper Approximation. Given a subset Ξ ⊆ U, one can define the P-lower approximation of Ξ by:
    P̲Ξ = {ξ : [ξ]_P ⊆ Ξ}.
    Moreover, one can define the P-upper approximation of Ξ by:
    P̄Ξ = {ξ : [ξ]_P ∩ Ξ ≠ ∅}.
  • Positive Region. The positive region of the partition U/IND(Q) with respect to P is defined as the set of all members of U that can be uniquely classified into blocks of the partition U/IND(Q) using the knowledge in P, which can be determined by:
    POS_P(Q) = ⋃_{Ξ ∈ U/IND(Q)} P̲Ξ.
  • Dependency Degree (γ). The dependency degree is the ratio of all objects of U that can be appropriately classified into the blocks of the partition U/IND(Q) by means of P. This dependency degree is denoted by γ_P(Q) and determined by:
    γ_P(Q) = |POS_P(Q)| / |U|,
    where |·| is the cardinality measure.
Therefore, the dependency degree can be stated as the ratio of the objects of U that can be classified into the blocks of the partition U/IND(Q) using P.
A feature subset Q is said to depend totally or partially on another feature subset P if γ_P(Q) = 1 or γ_P(Q) < 1, respectively. In order to measure the quality of a solution x, the dependency degree γ_x(D) of the decision attribute D can be used. Therefore, for two solutions x and y, x is better than y if one of the following conditions holds:
  • γ_x(D) > γ_y(D);
  • γ_x(D) = γ_y(D) and |x| < |y|,
where |x| and |y| are the numbers of features in x and y, respectively.
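The dependency degree and the comparison rule above can be sketched on a toy decision table (attribute names and values here are hypothetical, chosen only for illustration):

```python
from collections import defaultdict

def dependency_degree(rows, P, Q):
    """gamma_P(Q): fraction of objects whose P-equivalence class maps to a
    single decision value of Q (i.e., lies in the positive region)."""
    decisions = defaultdict(set)   # P-class -> set of decision values seen
    sizes = defaultdict(int)       # P-class -> number of objects
    for r in rows:
        key = tuple(r[a] for a in P)
        decisions[key].add(r[Q])
        sizes[key] += 1
    pos = sum(n for key, n in sizes.items() if len(decisions[key]) == 1)
    return pos / len(rows)

def better(gamma_x, size_x, gamma_y, size_y):
    """x beats y on a higher gamma, or the same gamma with fewer features."""
    return gamma_x > gamma_y or (gamma_x == gamma_y and size_x < size_y)

# Toy decision table: two condition features and one decision attribute.
table = [
    {"temp": "hot",  "humidity": "high", "ghi_class": "low"},
    {"temp": "hot",  "humidity": "high", "ghi_class": "high"},  # conflicts with row 1
    {"temp": "mild", "humidity": "low",  "ghi_class": "high"},
    {"temp": "mild", "humidity": "low",  "ghi_class": "high"},
    {"temp": "cool", "humidity": "low",  "ghi_class": "low"},
]
print(dependency_degree(table, ["temp", "humidity"], "ghi_class"))  # -> 0.6
```

Only the first two rows are inconsistent (same condition values, different decisions), so three of the five objects lie in the positive region and γ = 0.6.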

2.1.3. Initialization

A random binary vector is generated as an initial solution. The tabu and elite lists are initialized as empty lists. The most recently visited solutions are placed in the tabu list to prevent the search from being trapped in local optima. The elite list contains the best solutions found so far, which can later be included in the intensification steps.

2.1.4. Search Procedures

The key search procedures of the proposed FS system are identical, up to minor modifications, to those of our previously published system [38]. Specifically, the proposed method begins with an initial solution and continues to produce trial solutions within the neighborhood of the current one. The stop criterion is met when no improvement is accomplished during a predefined number of consecutive iterations. Thereafter, the search process initiates a diversification step from a new diverse solution. If the number of such consecutive non-improvement iterations reaches another predefined number of iterations, an intensification step is initiated to refine the best solution achieved so far. If the number of iterations reaches a maximum permitted iteration limit, the search is terminated. Lastly, the search process uses a final diversification-intensification step to obtain the final solution. The details of the neighborhood and local search procedures are explained below.
  • Neighborhood Search. The neighborhood of the current iterate solution x = (x_1, …, x_n) is broken down into a fixed number ℓ of neighborhood zones denoted by Z_j, j = 1, …, ℓ, and expressed as:
    Z_j(x) = {x^j : x^j = (x_1^j, …, x_n^j)},
    where
    x_i^j ≠ x_i for i ∈ {i_1, …, i_j} ⊆ {1, …, n} with i_1 ≠ ⋯ ≠ i_j, and x_i^j = x_i otherwise.
    Within each zone, the search process generates a trial solution in accordance with the tabu restriction to avoid revisiting recent solutions.
  • Solution and Memory Updates. The next iterative solution is selected as the best trial one among the generated solutions. Thereafter, the tabu and elite lists are revised.
  • Local Search. Using a local search technique called Shaking [38], the best solution is improved by attempting to sequentially eliminate its attributes without lowering its dependency degree value. The steps of the shaking procedure are depicted in Figure 3, which is a modified version of the original shaking procedure in [38]. The standard shaking technique [38] is used only to lower the cardinality of the best reducts whose γ function values are equal to 1, whereas the modified shaking procedure applies the feature reduction to both total and partial reducts.
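Since a trial solution in zone Z_j differs from the current solution in exactly j bit positions, zone sampling can be sketched as follows (a minimal sketch; the tabu restriction is reduced to a simple membership test):

```python
import random

def neighbor_in_zone(x, j, rng, tabu=()):
    """Generate a trial solution in zone Z_j: flip j distinct bits of x,
    retrying while the result was recently visited (tabu)."""
    while True:
        y = list(x)
        for i in rng.sample(range(len(x)), j):  # j distinct positions
            y[i] = 1 - y[i]                     # flip the binary entry
        if tuple(y) not in tabu:
            return y

rng = random.Random(42)
x = [1, 0, 1, 1, 0, 0, 1, 0]
trials = [neighbor_in_zone(x, j, rng) for j in (1, 2, 3)]  # zones Z_1, Z_2, Z_3
```

The best of the generated trial solutions then becomes the next iterate, after which the tabu and elite lists are revised.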

2.1.5. Diversification

Whenever diversification is required, a new diverse solution is generated. The attributes included in a diverse solution are selected with a probability that is inversely proportional to their frequency of appearance in the previously generated solutions.
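Diversification can be sketched as biased sampling in which a feature's selection probability decreases with its appearance rate in past solutions (the exact bias used in the paper may differ; `1 - frequency` is an illustrative choice):

```python
import numpy as np

def diverse_solution(history, rng):
    """Build a diverse binary solution: each feature is selected with a
    probability inversely related to its frequency in past solutions."""
    freq = np.asarray(history).mean(axis=0)   # appearance rate per feature
    prob = 1.0 - freq                         # rarely used features are favoured
    return (rng.random(len(prob)) < prob).astype(int)

rng = np.random.default_rng(7)
history = [[1, 0, 1], [1, 0, 0], [1, 0, 1]]   # feature 0 always used, feature 1 never
print(diverse_solution(history, rng))
```

With this bias, a feature present in every past solution is never picked again, while one that never appeared is always picked, pushing the search into unexplored regions.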

2.1.6. Final Intensification

In order to produce new promising solutions, the common features that appear in the elite solutions can be utilized. In particular, the obtained reducts are stored in a package called the Reduct Set (RedSet). The term “core” is then defined as the intersection of all reducts saved in the RedSet. Thus, a trial solution x_Final is constructed as the intersection of the best m reducts in RedSet, so that it contains the core. The trial solution x_Final is only considered if its number of features is smaller by at least two than that of the best obtained reducts. Thereafter, new features are added to x_Final by flipping to one the zero positions of x_Final that yield the highest γ-value. This upgrading process continues until a suitable new solution is found.
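The core construction, i.e., the intersection of the best m reducts in RedSet, amounts to a bitwise AND over their binary coding vectors (the reducts below are hypothetical):

```python
import numpy as np

def core_of(red_set, m):
    """Intersection of the best m reducts: a feature stays only if every one
    of those reducts selects it (bitwise AND via element-wise min)."""
    return np.array(red_set[:m]).min(axis=0)  # assumes RedSet is sorted best-first

red_set = [
    [1, 1, 0, 1, 0],
    [1, 0, 0, 1, 1],
    [1, 1, 0, 1, 0],
]
x_final = core_of(red_set, 3)
print(x_final)  # -> [1 0 0 1 0]
```

Starting from this intersection, zero positions would then be flipped to one, choosing at each step the position that yields the highest γ-value, until a suitable solution is reached.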

2.1.7. Control and Termination

Three non-improvement counters (I_local, I_div, I_global) are used to control the application of the local search, diversification, and final intensification, where I_local < I_div < I_global. Specifically, if a number of non-improvement iterations I_local is reached, the shaking procedure is utilized. Then, if the number of non-improvement iterations increases and reaches I_div, a new diverse solution is generated. Finally, if the number of non-improvement iterations exceeds I_global, the final intensification is employed to refine the best solutions in RedSet.
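This three-counter control logic can be sketched as a small dispatcher (one plausible reading of the thresholds; whether each trigger is `==` or `>=` is an assumption of this sketch):

```python
def control_step(no_improve, i_local, i_div, i_global):
    """Pick the escape mechanism after `no_improve` consecutive
    non-improving iterations, with i_local < i_div < i_global."""
    if no_improve > i_global:
        return "final_intensification"   # refine the best solutions in RedSet
    if no_improve == i_div:
        return "diversification"         # restart from a diverse solution
    if no_improve == i_local:
        return "shaking"                 # local search on the best solution
    return "continue"

print(control_step(5, 5, 10, 20))   # -> shaking
print(control_step(10, 5, 10, 20))  # -> diversification
print(control_step(21, 5, 10, 20))  # -> final_intensification
```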

2.2. Prediction Models

Several solar radiation prediction models for the global horizontal irradiance (GHI) values are proposed on the basis of classification, regression, numerical, and hybrid techniques. The target is to obtain predicted values of the GHI (in Wh/m² per day) through regression models, or predicted classes covering different ranges of the GHI values through classification models. All prediction models are created in two versions: with and without feature selection. The numerical weather forecasting models are embedded within hybrid regression models to obtain improved predicted values that are hopefully better than those obtained by the pure numerical or machine learning models.

2.2.1. Machine Learning Models

The range of solar radiation energy can be discretized into a certain number of classes. Then, several classifiers are used to predict the classes of solar radiation energy. The following classifiers are invoked in this work.
  • Decision Trees. Binary decision trees are multi-class learners in which decisions follow the shape of a tree, from its root node down to its leaf nodes, which contain the response [40]. Different decision tree structures can be used in classification based on the number of leaves used to make distinctions among classes. A low, medium, or high number of leaves corresponds to the coarse, medium, and fine decision tree models, respectively. The optimizable model employs certain techniques to automatically tune the model hyper-parameters.
    1.
    Fine Decision Tree;
    2.
    Medium Decision Tree;
    3.
    Coarse Decision Tree;
    4.
    Optimizable Decision Tree.
  • Discriminant Analysis. Discriminant analysis assumes that data are generated by different classes based on different Gaussian distributions. Therefore, these classifier models attempt to estimate the parameters of a Gaussian distribution that fits each class [41]. Two common types of such classifier models are linear and quadratic discriminant analysis, in addition to optimizable discriminant analysis, in which model hyper-parameters are automatically tuned:
    5.
    Linear Discriminant Analysis;
    6.
    Quadratic Discriminant Analysis;
    7.
    Optimizable Discriminant Analysis.
  • Naïve Bayes Classifiers. These classifiers classify data in two steps. In the first step, the classifier estimates the parameters of a probability distribution, assuming that the predictors are conditionally independent given the class, based on some training data. In the second step, the classifier considers unseen test data samples and computes the posterior probability of those samples belonging to each class [42]. The method then classifies the test data according to the largest posterior probability. Such a classifier model invokes different techniques, such as Gaussian, kernel predictors, or an optimizable technique:
    8.
    Gaussian Naïve Bayes;
    9.
    Kernel Naïve Bayes;
    10.
    Optimizable Naïve Bayes.
  • Support Vector Machine (SVM). This classifier uses a separating hyperplane to classify data into two classes [43]. Different kernel tricks can be utilized if the data are not linearly separable. Moreover, the classifier can handle multi-class classification through different extension strategies. Several kernels and modifications can be used to design the following SVM models.
    11.
    Linear SVM;
    12.
    Quadratic SVM;
    13.
    Cubic SVM;
    14.
    Fine Gaussian SVM;
    15.
    Medium Gaussian SVM;
    16.
    Coarse Gaussian SVM;
    17.
    Optimizable SVM.
  • Nearest Neighborhood Classifiers. A nearest-neighbor classifier labels a query point according to its nearest training points, found within a defined distance under a specified distance metric such as Euclidean or Hamming [44,45].
    18.
    Fine KNN;
    19.
    Medium KNN;
    20.
    Coarse KNN;
    21.
    Cosine KNN;
    22.
    Cubic KNN;
    23.
    Weighted KNN;
    24.
    Optimizable KNN Classifiers.
  • Ensemble Classifiers. A classification ensemble is a prediction model that comprises a weighted combination of several models for classification. In general, combining multiple classification models improves predictive performance. Ensemble classifiers use boosting, random forest, bagging, random subspace, and error-correcting output codes ensembles for multi-class learning [46].
    25.
    Ensemble Boosted Decision Trees;
    26.
    Ensemble Bagged Decision Trees;
    27.
    Ensemble Subspace Discriminant Analysis;
    28.
    Ensemble Subspace KNN;
    29.
    Ensemble RUS Boosted Decision Trees;
    30.
    Optimizable Ensemble Classifiers.
  • Neural Networks. Artificial neural networks are a subset of machine learning and are at the heart of deep learning algorithms. Their name and structure are inspired by the human brain, and they mimic the way in which biological neurons communicate, from the input layer to the output layer through hidden layers. They have proven highly successful in different applications [47]. The following neural network models are used, with various sizes and numbers of hidden layers.
    31.
    Narrow Neural Networks;
    32.
    Medium Neural Networks;
    33.
    Wide Neural Networks;
    34.
    Bilayered Neural Networks;
    35.
    Trilayered Neural Networks.
Regression models can be implemented to predict certain amounts of solar radiation energy by estimating GHI values. One of the most powerful regression models is the Gaussian Process Regression (GPR) model. The GPR model is a non-parametric kernel-based probabilistic model [48,49], which measures the similarity between training data to predict the value for test data.
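The kernel-based prediction behind GPR can be sketched in NumPy using the posterior mean only (a real system would use a tuned library implementation; the three-feature training data below are synthetic):

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0):
    """Squared-exponential similarity between the rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * d2 / length_scale ** 2)

def gpr_predict(X_train, y_train, X_test, noise=1e-2):
    """GP posterior mean: k(X*, X) (K + noise * I)^(-1) y."""
    K = rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    alpha = np.linalg.solve(K, y_train)      # weights on the training points
    return rbf_kernel(X_test, X_train) @ alpha

# Synthetic reduced-feature data: 3 weather features -> a smooth GHI-like target.
rng = np.random.default_rng(0)
X = rng.random((30, 3))
y = 5.0 + 2.0 * X[:, 0]
print(gpr_predict(X, y, X[:5]))  # close to y[:5]
```

This makes the "similarity between training data" idea concrete: each test prediction is a kernel-weighted combination of the training targets.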

2.2.2. Numerical Model

Numerous prediction models using NWP of solar radiation have been applied [4,5]. A recent and effective numerical model is the WRF mesoscale model [50]. The WRF model serves the needs of both atmospheric research and operational forecasting. It is fitted with two dynamic (computational) cores, a data assimilation system, and a software architecture enabling parallel computation and system scalability.
The WRF model is a mesoscale model developed by a group of scientists from different institutes and centers, such as the National Center for Atmospheric Research (NCAR), the National Centers for Environmental Prediction (NCEP), the National Oceanic and Atmospheric Administration (NOAA), and a number of other collaborating institutes and universities. The WRF is a fully compressible nonhydrostatic three-dimensional (3D) primitive equation model that is designed for simulating atmospheric phenomena across scales. These scales vary from large eddies (∼100 m) to mesoscale circulations and waves (from ∼100 m to >1000 km).
The WRF system provides different physics options for cloud parameterization, planetary boundary layer (PBL) turbulence physics, atmospheric radiation, and land surface models (LSMs). It also incorporates various initialization routines and data assimilation techniques that numerous weather agencies and research centers have extensively tested. Additional manuals and descriptions of the WRF model are fully documented in [51,52].

2.2.3. Hybrid Model

In order to improve the prediction process, the machine learning models can use the output of the numerical models. In this research, a hybrid model is designed by using known weather data and the estimated future solar radiation energy data obtained by the WRF model to build a new hybrid model for short-term solar radiation energy. A Gaussian Process Regression (GPR) model is utilized for this task; GPR models are probabilistic models with non-parametric kernel-based structures [49]. The proposed hybrid model is presented along with a corrector model that enhances the prediction values of the WRF model by using the machine learning of the GPR.
Figure 4 illustrates a high-level structure of the hybrid model, which uses two types of input data. The first input data are the historical solar features, including the GHI values of the m previous days, denoted by x_1, x_2, …, x_m. The other input data are the predicted value of GHI on the day considered for prediction, denoted by y. The GPR model then uses these inputs to predict a new GHI value.
Specifically, the designed GPR model predicts new GHI values, which are expected to be more accurate than those predicted by the WRF models. Consider the input vector X = (x_1, x_2, …, x_m, y); then a new GHI value ŷ can be computed from the following regression model:
ŷ = X^T β + ϵ,
where ϵ is drawn from the normal distribution N(0, σ²), and β and σ are estimated from the training data [49]. In order to deal with non-linearity, a kernel-based structure can be used to modify the above-mentioned model to be:
ŷ = h(X)^T β + f(X),
where h(·) are basis transformation functions and f(·) is a zero-mean Gaussian process [49].
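The corrector idea can be sketched with synthetic data: the m previous-day GHI values plus the WRF forecast form the input X, and a fitted model corrects the raw forecast. Here a plain least-squares corrector stands in for the paper's kernel-based GPR, and all values are simulated:

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 3, 200

# Synthetic history: m previous-day GHI values per sample (Wh/m^2 scale).
hist = 4000 + 2000 * rng.random((n, m))
true_ghi = hist.mean(axis=1)                         # synthetic target for the day
wrf = true_ghi + 300 + 400 * rng.standard_normal(n)  # biased, noisy WRF forecast

# Input vector X = (x_1, ..., x_m, y_wrf) plus an intercept column.
X = np.column_stack([hist, wrf, np.ones(n)])
beta, *_ = np.linalg.lstsq(X, true_ghi, rcond=None)
corrected = X @ beta

rmse = lambda e: np.sqrt(np.mean(e ** 2))
print(rmse(wrf - true_ghi), rmse(corrected - true_ghi))  # corrector cuts the error
```

The corrector learns both the bias and how to weight the historical values against the numerical forecast, which is exactly the role the GPR plays in the proposed hybrid model.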

3. Experimental Setup and Evaluation

Available observed historical data are exploited in order to measure the performance of the reduction in the input feature space. In particular, the impact of dimension reduction on the solar radiation estimation process is investigated. This investigation is conducted by measuring weather data variables, such as temperature, wind speed, humidity and direct normal irradiance, as well as other environmental data. Table 1 enumerates the attributes used for evaluation purposes. Evaluation of the proposed system is performed by setting the GHI for the current day as the objective output.
An experiment is designed to evaluate the discrete energy class prediction with and without feature selection. For compliance with typical classification frameworks, the GHI measurements are discretized into a finite number of levels. The range of the recorded GHI spans from 0 to 9000 Wh/m². Two discrete sets are generated. The first set, called 5-class, comprises five levels of GHI values, each containing approximately 2000 GHI values. Similarly, the other set, 10-class, contains ten different classes representing ten discrete GHI levels with approximately 1000 values each.
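Equal-frequency binning reproduces class sets like the 5-class and 10-class ones described above (the GHI values here are simulated, and equal-frequency binning is inferred from the stated per-class counts, not confirmed by the paper):

```python
import numpy as np

def discretize(values, n_classes):
    """Equal-frequency binning: labels 0..n_classes-1 with ~equal counts."""
    inner = np.linspace(0, 1, n_classes + 1)[1:-1]  # interior quantile levels
    edges = np.quantile(values, inner)              # class boundaries
    return np.digitize(values, edges)

rng = np.random.default_rng(0)
ghi = rng.uniform(0, 9000, size=10_000)  # simulated daily GHI readings

labels_5 = discretize(ghi, 5)    # ~2000 samples per class
labels_10 = discretize(ghi, 10)  # ~1000 samples per class
```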
In order to measure the candidate reduction of the input feature space, three data sets, which were collected at distant stations distributed around Saudi Arabia, are used. As shown in Table 2, the locations of these stations exhibit diverse climatic conditions. The diversity of these cities in terms of location and topography supported their choice for our experimentation. Furthermore, the research nature of the stations installed in these cities makes it easy to obtain the necessary solar data. King Abdullah City for Atomic and Renewable Energy (KACARE) has installed and monitors these stations under the Renewable Resource Monitoring and Mapping (RRMM) Program [53,54]. The main weather measurement used for evaluation is the GHI. The data sets are collected for three Saudi cities on a daily basis from mid-2013 to the end of 2014, and a comprehensive evaluation is performed using them. However, because of technical issues with some of these recently installed KACARE stations, two important readings are missing during this period: the visibility and sky cover parameters. This is apart from the obvious uncertainty associated with all other measurements. To overcome this issue, another source, the Presidency of Meteorology and Environment stations, is used to obtain the visibility data. As depicted in Table 1, only two of the three cities used have the visibility parameter recorded.
In order to evaluate the GHI prediction performance of the proposed hybrid learning model, which is used for regression in this case, other data sets were selected to cover four different levels of challenge: clear, cloudy, dusty, and dusty-cloudy [8]. Table 3 [8] presents the challenging cases, including 41 dust storms. The solar attributes are collected on those dates and the following days, leading to data sets with 81 records at three stations: KAU, QU, and TU. Recording one day after each storm in addition to the storm days should add up to 82 days rather than 81; one day is missing because one of the storms lasted for two consecutive days. The prediction performance is assessed by feeding measurements of the preceding days to the regression process in order to predict the GHI on these specific 81 days.
These 41 cases reveal clear seasonal changes in the observed frequency of dust storms during 2014. The highest frequency of events occurred during spring and summer (March–August), whereas the lowest number of dust-storm events took place in autumn and winter (September–November). More details are found in [55].
The simulations of severe dust-storm events over Saudi Arabia are performed using WRF with the dynamic core of the Advanced Research WRF (ARW). The WRF model provides two-day hourly forecasts of surface solar irradiance for specific cases in 2014. Atmospheric dust aerosol is an indirect quantity that is highly correlated with solar radiation at the surface: an increase in atmospheric dust aerosol translates directly into a reduction in surface solar irradiance. Consequently, improving the aerosol forecast leads to a more accurate prediction of surface solar radiation.
For the solar irradiance prediction process, the forecasting data of the National Centers for Environmental Prediction (NCEP), which follow the Global Forecast System (GFS) model [56], are used. As a preprocessing step, these GFS forecasts are downscaled both spatially and temporally. Four daily NCEP GFS runs are issued at 00, 06, 12, and 18 UTC, with temporal and spatial resolutions of three hours and 0.5° × 0.5°, respectively. Forecast accuracy is evaluated by comparing the GHI forecasts of WRF with the obtained ground measurements. Land cover and elevation data were obtained from the digital terrain model of the United States Geological Survey [57].
To represent different weather conditions, simulations of the aforementioned cases were obtained using the non-hydrostatic WRF-ARW mesoscale model (version 3.7.1), driven by the NCEP GFS. Two nested domains are included in the model configuration, as depicted in Figure 5. Unevenly spaced vertical levels are used, with grid spacings of 27 km for the coarser domain 1 and 9 km for domain 2. In the evaluation procedure, estimates corresponding to the domain 2 grid points that enclose the experimental radiometric stations are used. Two-day forecasting simulations with one-hour resolution were performed on a daily basis, starting at midnight UTC. The two-way nesting option between domains 1 and 2 was selected to allow the grids to interact in both directions.
In this work, the Grell convective scheme, an advanced version of the Grell-Devenyi ensemble convection scheme [58], is used. The Rapid Radiative Transfer Model (RRTM) scheme is selected to parameterize long-wave radiation [59]; it represents the influence of the detailed absorption spectrum, accounting for carbon dioxide, ozone, and water vapor. The Dudhia scheme is used for short-wave radiation [60], and the Yonsei University (YSU) scheme [61] for the planetary boundary layer (PBL).
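For orientation, a WRF `namelist.input` fragment of the kind used to select this configuration might look as follows. This is an illustrative sketch only: the option indices follow the standard WRF 3.7.1 conventions for the schemes named above, and the study's actual namelist values are not reproduced in the paper.

```
&domains
 max_dom           = 2,
 dx                = 27000, 9000,   ! 27 km coarse grid, 9 km nest
 parent_grid_ratio = 1, 3,
 feedback          = 1,             ! two-way nesting between domains 1 and 2
/
&physics
 cu_physics        = 5, 5,          ! Grell (G3) ensemble convection
 ra_lw_physics     = 1, 1,          ! RRTM long-wave radiation
 ra_sw_physics     = 1, 1,          ! Dudhia short-wave radiation
 bl_pbl_physics    = 1, 1,          ! YSU planetary boundary layer
/
```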
Specific days of the year with distinct sky conditions are selected to analyze the performance of the WRF model. The main objective of this selection is to evaluate the model's forecasting accuracy under different meteorological conditions; the condition of the sky is therefore the main basis of the analysis. In particular, four daily scenarios are considered: clear sky, cloudy, dusty, and dusty-cloudy. From an operational perspective, it is more practical to forecast on a day-ahead mean basis than on an hour-ahead basis.

4. Results and Discussion

In this section, the obtained results are discussed in the context of the main research questions raised in this paper. The first question is which prediction paradigm performs best: numerical, machine learning, or hybrid. The second concerns the impact of feature-space dimension reduction on prediction performance. The latter aspect was partly considered in an earlier paper [62]; an extended treatment is presented here by applying the hybrid model to regression on real-valued, non-discretized data.

4.1. Feature Selection Results

One of the main aims of this paper is to optimize the input feature space. Before discussing the proposed models, the impact of the proposed feature-space reduction on efficiency is examined, specifically its potential effects as measured by the γ-values. The following three forms of the output space are considered:
  • A continuous real-number space;
  • A 5-class decision space;
  • A 10-class decision space.
Figure 6 illustrates the predictive quality of the input attributes when fed individually to the classifier, that is, a single-reduct input space. Its three panels display the γ-values separately for each of the three output spaces listed above, for each single attribute. It is clear that the DH and DN attributes yield the best γ-values, followed by H, while WS and PWS yield the worst.
Figure 7, Figure 8 and Figure 9 illustrate how dual-attribute reductions behave in terms of the γ-values. The left diagonals in these figures reflect the top view of Figure 6. The non-linearity of the input data is evident here: the best quality is not always achieved by combining attributes that perform well individually, and vice versa. This reflects the complex nature of the problem under consideration.
More comprehensive results for the real-valued, 5-class, and 10-class data are presented in Table 4 and Table 5. The 5-class and 10-class results are combined in Table 5, as they are similar. These results show how combining a very low-γ single-reduct attribute with other attributes may yield better prediction quality than a combination of good single reducts. In the KAU case, for example, combining H and DH with PWS boosted the value of γ to 100%. A similar effect appears when the low-γ attribute P is combined with other attributes.
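The γ-values themselves are rough-set dependency degrees: the fraction of records whose values on the selected attributes determine the decision attribute unambiguously (the positive region divided by the universe size). A self-contained sketch on a toy decision table also reproduces the effect just described, where individually weak attributes combine into a fully dependent reduct:

```python
from collections import defaultdict

def gamma(table, attrs, decision):
    """Rough-set dependency degree gamma(attrs -> decision):
    fraction of rows lying in the positive region, i.e. rows whose
    indiscernibility class under `attrs` maps to a single decision value."""
    groups = defaultdict(set)
    for row in table:
        groups[tuple(row[a] for a in attrs)].add(row[decision])
    consistent = sum(
        1
        for row in table
        if len(groups[tuple(row[a] for a in attrs)]) == 1
    )
    return consistent / len(table)

# Toy 4-record table (hypothetical values): 'H' alone is fully ambiguous,
# but the pair {'H', 'P'} determines the class for every record.
table = [
    {"H": 0, "P": 0, "cls": "low"},
    {"H": 0, "P": 1, "cls": "high"},
    {"H": 1, "P": 0, "cls": "low"},
    {"H": 1, "P": 1, "cls": "high"},
]
print(gamma(table, ["H"], "cls"), gamma(table, ["H", "P"], "cls"))  # 0.0 1.0
```

This is the same non-monotone behavior seen in Tables 4 and 5: a low-γ attribute can still be essential inside a larger reduct.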

4.2. Prediction Results with Classification

For prediction and energy-level classification, a five-fold cross-validation evaluation scheme is followed. The results with reducts ignore the attributes that were not selected for the real-valued decision attribute, as shown in Table 4, for example the attributes WD, WS, and PWS.
Table 6, Table 7 and Table 8 present the class prediction results using the 35 prediction models discussed in Section 2.2. The models are sorted by the averages of their classification rates, and the best rate in each column is highlighted in bold. Generally, using reducts for feature selection yields better class prediction rates in most cases. The most remarkable performance boost occurs in the cases originally affected by missing measurements, e.g., discriminant analysis on the TU datasets. The classification levels vary with the model used, with the support vector machine and neural network models holding the advantage.
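The evaluation protocol just described can be sketched compactly. This is a hedged illustration on synthetic data, not the KACARE measurements: nine random columns stand in for the attributes of Table 1, a binary label stands in for the energy classes, and a hypothetical {DH, DN, H} reduct is compared against the full feature set under five-fold cross-validation with an SVM.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 9))               # 9 synthetic solar attributes
y = (X[:, 3] + X[:, 4] > 0).astype(int)     # labels driven by columns 3, 4
reduct = [3, 4, 6]                          # hypothetical {DH, DN, H} reduct

# Mean five-fold classification rate with and without the reduct.
full = cross_val_score(SVC(), X, y, cv=5).mean()
red = cross_val_score(SVC(), X[:, reduct], y, cv=5).mean()
print(f"full: {full:.2f}  reduct: {red:.2f}")
```

In this synthetic setup the reduct discards pure-noise columns, mirroring why attribute reduction tends to help the classifiers in Tables 6–8.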

4.3. Prediction Results with Regression

Two solar forecasting experiments were carried out using regression models. In the first experiment, the GPR models are applied to the KAU, QU and TU datasets. The main results of this experiment are presented in Figure 10, Figure 11, Figure 12, Figure 13, Figure 14 and Figure 15 and Table 9. The predicted solar irradiance versus the real values for the invoked datasets, with and without reduction, are shown in Figure 10, Figure 11, Figure 12, Figure 13, Figure 14 and Figure 15. These figures indicate how promising the proposed regression models are, especially those with reduction. Table 9 confirms this by comparing the error values of the two models using the Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and R-Squared measures. Given the prediction and observation values $y_i$ and $\zeta_i$, $i = 1, \ldots, n$, respectively, these comparative measures are computed as follows:
$$e_i = y_i - \zeta_i,$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} e_i^2},$$
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} |e_i|,$$
$$\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i,$$
$$R\text{-Squared} = 1 - \frac{\sum_{i=1}^{n} e_i^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}.$$
The RMSE and MAE estimate the errors of the computed GHI values in W/m². The comparison in Table 9 highlights the success of the reduction models in obtaining better results in two of the three cases, with very close results in the third.
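The three error measures defined above translate directly into code. Following the paper's notation, `y` holds predictions and `z` (for ζ) holds observations, both in W/m²; note that the R-Squared definition above centers on the mean of the predictions.

```python
import numpy as np

def rmse(y, z):
    """Root mean square error between predictions y and observations z."""
    e = y - z
    return np.sqrt(np.mean(e ** 2))

def mae(y, z):
    """Mean absolute error."""
    return np.mean(np.abs(y - z))

def r_squared(y, z):
    """R-Squared as defined in the paper (centered on the prediction mean)."""
    e = y - z
    return 1.0 - np.sum(e ** 2) / np.sum((y - np.mean(y)) ** 2)

# Toy example: three days of predicted vs. observed GHI (W/m^2).
y = np.array([500.0, 620.0, 710.0])
z = np.array([480.0, 650.0, 700.0])
print(round(rmse(y, z), 2), round(mae(y, z), 2))  # 21.6 20.0
```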
In the second regression experiment, the machine learning (GPR), numerical (WRF), and hybrid (GPR with WRF data) prediction models are applied to the KAU, QU, and TU dust-storm datasets. The daily solar features, including the GHI values of the current and previous days, are used to predict the GHI on the next day. Root mean square errors are computed under the five-fold cross-validation criterion and reported in Table 10. These results imply that the use of machine learning generally improves the results, and the proposed hybrid model gives the best results in most cases; even in the only exception, on the TU dataset, the error of the hybrid model is only slightly larger than that of GPR. A second conclusion is that feature reduction does not help much in the regression process. Figure 16, Figure 17 and Figure 18 give detailed class prediction results as confusion matrices, which collect the predicted and actual classification information produced by the best classifiers. In most cases, classification failures occur between neighboring classes.
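The hybrid construction amounts to feeding the WRF day-ahead forecast into the GPR model as one more input feature alongside the measured attributes. The sketch below illustrates the idea on synthetic stand-in data (81 records, matching the storm datasets' size); the actual feature set, kernel choice, and WRF outputs are not reproduced here.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
n = 81
X_meas = rng.normal(size=(n, 4))                  # previous-day measurements
ghi_true = 600 + 120 * X_meas[:, 0] + 10 * rng.normal(size=n)
wrf_fcst = ghi_true + 60 * rng.normal(size=n)     # noisy numerical forecast

# Hybrid model: append the WRF forecast as an extra regressor input.
X_hybrid = np.column_stack([X_meas, wrf_fcst])

scoring = "neg_root_mean_squared_error"
gpr_only = cross_val_score(GaussianProcessRegressor(normalize_y=True),
                           X_meas, ghi_true, cv=5, scoring=scoring)
hybrid = cross_val_score(GaussianProcessRegressor(normalize_y=True),
                         X_hybrid, ghi_true, cv=5, scoring=scoring)
print(f"GPR RMSE: {-gpr_only.mean():.1f}  hybrid RMSE: {-hybrid.mean():.1f}")
```

The five-fold RMSE comparison printed here parallels the structure of Table 10, with the numerical forecast acting as an informative, if noisy, additional feature.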
Figure 19, Figure 20 and Figure 21 reveal in detail the deviations between the true and predicted GHI values. The hybrid model performs well in most cases. On some days the error between true and predicted values is large; these days usually follow cloudy ones, whose measurements are mostly inaccurate.

5. Conclusions

This paper presents hybrid machine learning approaches for solar radiation estimation that utilize numerical methods. Numerical models, particularly the WRF models, are widely used in forecasting weather data. One of the main achievements of this paper is to show the extent to which machine learning models can improve the predictions of numerical methods. This has been achieved by building hybrid models through several layers of methodology design. First, a feature selection and dimensionality reduction approach was proposed for the parameters associated with solar radiation estimation. The proposed attribute reduction uses an adaptive memory programming approach to optimize the input feature space of a solar radiation model. Then, different classification models are used to predict the solar radiation classes. The proposed methodologies are evaluated using a real environmental temporal dataset collected from diverse regions in Saudi Arabia. Feature selection played an important role in increasing the class prediction rates, which rose by up to 8.5% for the best classifiers and up to 15% for the other classifiers, depending on the test region considered. Finally, the WRF data were used in the proposed regression models to obtain prediction results that are generally better than those of the pure machine learning and WRF models. The improvements reached up to 5.6% in the average root mean square error and up to 8.3% in the mean absolute error. The obtained results demonstrate the effectiveness of the proposed hybrid model in improving the prediction of GHI values. On some datasets, the hybrid models reduced the root mean square errors by 70.2% and 4.3% relative to the numerical and machine learning models, respectively.
On some reduced-feature datasets, the hybrid models decreased the root mean square errors by 47.3% and 14.4% relative to the numerical and machine learning models, respectively. For discrete classes, attribute reduction that combines a few low-dependency-degree single-reduct attributes with other attributes yields solutions of very good quality. On the other hand, attribute reduction did not contribute much to performance improvement when discretization is used with the input data classes.

Author Contributions

Conceptualization, A.-R.H., M.A. (Majid Almaraashi), A.E.A.-H. and M.A. (Mahmoud Abdulrahim); methodology, A.-R.H., M.A. (Majid Almaraashi), A.E.A.-H. and M.A. (Mahmoud Abdulrahim); programming and implementation, A.-R.H., M.A. (Majid Almaraashi), A.E.A.-H. and M.A. (Mahmoud Abdulrahim); writing—original draft preparation, A.-R.H., M.A. (Majid Almaraashi), A.E.A.-H. and M.A. (Mahmoud Abdulrahim); writing—review and editing, A.-R.H., M.A. (Majid Almaraashi), A.E.A.-H. and M.A. (Mahmoud Abdulrahim); funding acquisition, M.A. (Majid Almaraashi). All authors have read and agreed to the published version of the manuscript.

Funding

This work is part of a project funded by King Abdulaziz City for Science and Technology (KACST) with grant number 13-ENES2373-10.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Acknowledgments

The authors would like to thank King Abdulaziz City for Science and Technology—the Kingdom of Saudi Arabia, for supporting the project number (13-ENES2373-10). In addition, the authors would like to acknowledge King Abdullah City for Atomic and Renewable Energy (KACARE) for supplying data from the stations.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Cline, W.R. The Economics of Global Warming; Peterson Institute for International Economics: Washington, DC, USA, 1992. [Google Scholar]
  2. Riordan, C.; Hulstrom, R.; Cannon, T.; Myers, D. Solar radiation research for photovoltaic applications. Sol. Cells 1991, 30, 489–500. [Google Scholar] [CrossRef]
  3. Ibrahim, I.A.; Khatib, T. A novel hybrid model for hourly global solar radiation prediction using random forests technique and firefly algorithm. Energy Convers. Manag. 2017, 138, 413–425. [Google Scholar] [CrossRef]
  4. Muneer, T.; Younes, S.; Munawwar, S. Discourses on solar radiation modeling. Renew. Sustain. Energy Rev. 2007, 11, 551–602. [Google Scholar] [CrossRef]
  5. Şen, Z. Solar energy in progress and future research trends. Prog. Energy Combust. Sci. 2004, 30, 367–416. [Google Scholar] [CrossRef]
  6. Voyant, C.; Notton, G.; Kalogirou, S.; Nivet, M.L.; Paoli, C.; Motte, F.; Fouilloy, A. Machine learning methods for solar radiation forecasting: A review. Renew. Energy 2017, 105, 569–582. [Google Scholar] [CrossRef]
  7. Wang, L.; Kisi, O.; Zounemat-Kermani, M.; Salazar, G.A.; Zhu, Z.; Gong, W. Solar radiation prediction using different techniques: Model evaluation and comparison. Renew. Sustain. Energy Rev. 2016, 61, 384–397. [Google Scholar] [CrossRef]
  8. Abdulrahim, M.; Almaraashi, M. Forecasting of Short-Term Solar Radiation Based on a Numerical Weather Prediction Model over Saudi Arabia. In Proceedings of the 6th International Conference on Informatics, Environment, Energy and Applications, Jeju, Korea, 29–31 March 2017; pp. 16–19. [Google Scholar]
  9. Larson, V.E. Forecasting solar irradiance with numerical weather prediction models. Sol. Energy Forecast. Resour. Assess. 2013, 299–318. [Google Scholar] [CrossRef]
  10. Jimenez, P.A.; Hacker, J.P.; Dudhia, J.; Haupt, S.E.; Ruiz-Arias, J.A.; Gueymard, C.A.; Deng, A. WRF-Solar: Description and clear-sky assessment of an augmented NWP model for solar power prediction. Bull. Am. Meteorol. Soc. 2016, 97, 1249–1264. [Google Scholar] [CrossRef]
  11. Şen, Z. Fuzzy algorithm for estimation of solar irradiation from sunshine duration. Sol. Energy 1998, 63, 39–49. [Google Scholar] [CrossRef]
  12. Bhardwaj, S.; Sharma, V.; Srivastava, S.; Sastry, O.S.; Bandyopadhyay, B.; Chel, S.S.; Gupta, J.R.P. Estimation of solar radiation using a combination of Hidden Markov Model and generalized Fuzzy model. Sol. Energy 2013, 93, 43–54. [Google Scholar] [CrossRef]
  13. Mellit, A.; Arab, A.H.; Khorissi, N.; Salhi, H. An ANFIS-based Forecasting for Solar Radiation Data from Sunshine Duration and Ambient Temperature. In Proceedings of the 2007 IEEE Power Engineering Society General Meeting, Tampa, FL, USA, 24–28 June 2007; pp. 1–6. [Google Scholar] [CrossRef]
  14. Alobaidi, M.H.; Marpu, P.R.; Ouarda, T.B.M.J.; Ghedira, H. Mapping of the Solar Irradiance in the UAE Using Advanced Artificial Neural Network Ensemble. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2014, 7, 3668–3680. [Google Scholar] [CrossRef]
  15. Mellit, A.; Pavan, A.M. A 24-h forecast of solar irradiance using artificial neural network: Application for performance prediction of a grid-connected PV plant at Trieste, Italy. Sol. Energy 2010, 84, 807–821. [Google Scholar] [CrossRef]
  16. Khatib, T.; Mohamed, A.; Mahmoud, M.; Sopian, K. Modeling of Daily Solar Energy on a Horizontal Surface for Five Main Sites in Malaysia. Int. J. Green Energy 2011, 8, 795–819. [Google Scholar] [CrossRef]
  17. de Freitas Viscondi, G.; Alves-Souza, S.N. Solar Irradiance Prediction with Machine Learning Algorithms: A Brazilian Case Study on Photovoltaic Electricity Generation. Energies 2021, 14, 5657. [Google Scholar] [CrossRef]
  18. Takamatsu, T.; Ohtake, H.; Oozeki, T.; Nakaegawa, T.; Honda, Y.; Kazumori, M. Regional Solar Irradiance Forecast for Kanto Region by Support Vector Regression Using Forecast of Meso-Ensemble Prediction System. Energies 2021, 14, 3245. [Google Scholar] [CrossRef]
  19. Alzahrani, A.; Shamsi, P.; Dagli, C.; Ferdowsi, M. Solar irradiance forecasting using deep neural networks. Procedia Comput. Sci. 2017, 114, 304–313. [Google Scholar] [CrossRef]
  20. Gbémou, S.; Eynard, J.; Thil, S.; Guillot, E.; Grieu, S. A Comparative Study of Machine Learning-Based Methods for Global Horizontal Irradiance Forecasting. Energies 2021, 14, 3192. [Google Scholar] [CrossRef]
  21. Aslam, M.; Lee, J.M.; Kim, H.S.; Lee, S.J.; Hong, S. Deep learning models for long-term solar radiation forecasting considering microgrid installation: A comparative study. Energies 2020, 13, 147. [Google Scholar] [CrossRef] [Green Version]
  22. Chandola, D.; Gupta, H.; Tikkiwal, V.A.; Bohra, M.K. Multi-step ahead forecasting of global solar radiation for arid zones using deep learning. Procedia Comput. Sci. 2020, 167, 626–635. [Google Scholar] [CrossRef]
  23. Mukhoty, B.P.; Maurya, V.; Shukla, S.K. Sequence to sequence deep learning models for solar irradiation forecasting. In Proceedings of the 2019 IEEE Milan PowerTech, Milan, Italy, 23–27 June 2019; pp. 1–6. [Google Scholar]
  24. Jayalakshmi, N.Y.; Shankar, R.; Subramaniam, U.; Baranilingesan, I.; Stalin, A.K.B.; Rahim, R.; Ghosh, A. Novel Multi-Time Scale Deep Learning Algorithm for Solar Irradiance Forecasting. Energies 2021, 14, 2404. [Google Scholar] [CrossRef]
  25. De Araujo, J.M.S. Performance comparison of solar radiation forecasting between WRF and LSTM in Gifu, Japan. Environ. Res. Commun. 2020, 2, 045002. [Google Scholar] [CrossRef]
  26. Husein, M.; Chung, I.Y. Day-ahead solar irradiance forecasting for microgrids using a long short-term memory recurrent neural network: A deep learning approach. Energies 2019, 12, 1856. [Google Scholar] [CrossRef] [Green Version]
  27. Ghimire, S.; Deo, R.C.; Raj, N.; Mi, J. Deep solar radiation forecasting with convolutional neural network and long short-term memory network algorithms. Appl. Energy 2019, 253, 113541. [Google Scholar] [CrossRef]
  28. Qing, X.; Niu, Y. Hourly day-ahead solar irradiance prediction using weather forecasts by LSTM. Energy 2018, 148, 461–468. [Google Scholar] [CrossRef]
  29. Yu, Y.; Cao, J.; Zhu, J. An LSTM short-term solar irradiance forecasting under complicated weather conditions. IEEE Access 2019, 7, 145651–145666. [Google Scholar] [CrossRef]
  30. Taylor, S.J.; Letham, B. Forecasting at scale. Am. Stat. 2018, 72, 37–45. [Google Scholar] [CrossRef]
  31. Taylor, S.J.; Letham, B. Prophet: Automatic Forecasting Procedure. Available online: https://github.com/facebook/prophet (accessed on 17 November 2021).
  32. Snow, D. AtsPy: Automated Time Series Models in Python. Available online: https://github.com/firmai/atspy/ (accessed on 17 November 2021).
  33. Abdel-Nasser, M.; Mahmoud, K.; Lehtonen, M. Reliable Solar Irradiance Forecasting Approach Based on Choquet Integral and Deep LSTMs. IEEE Trans. Ind. Inform. 2020, 17, 1873–1881. [Google Scholar] [CrossRef]
  34. Almaraashi, M. Short-term prediction of solar energy in Saudi Arabia using automated-design fuzzy logic systems. PLoS ONE 2017, 12, e0182429. [Google Scholar] [CrossRef] [Green Version]
  35. Almaraashi, M. Investigating the impact of feature selection on the prediction of solar radiation in different locations in Saudi Arabia. Appl. Soft Comput. 2018, 66, 250–263. [Google Scholar] [CrossRef]
  36. Voyant, C.; Muselli, M.; Paoli, C.; Nivet, M.L. Numerical weather prediction (NWP) and hybrid ARMA/ANN model to predict global radiation. Energy 2012, 39, 341–355. [Google Scholar] [CrossRef] [Green Version]
  37. Boubaker, S.; Benghanem, M.; Mellit, A.; Lefza, A.; Kahouli, O.; Kolsi, L. Deep Neural Networks for Predicting Solar Radiation at Hail Region, Saudi Arabia. IEEE Access 2021, 9, 36719–36729. [Google Scholar] [CrossRef]
  38. Hedar, A.R.; Wang, J.; Fukushima, M. Tabu search for attribute reduction in rough set theory. Soft Comput. 2008, 12, 909–918. [Google Scholar] [CrossRef]
  39. Pawlak, Z. Rough Sets: Theoretical Aspects of Reasoning about Data; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2012; Volume 9. [Google Scholar]
  40. Loh, W.Y. Classification and regression trees. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 2011, 1, 14–23. [Google Scholar] [CrossRef]
  41. Li, T.; Zhu, S.; Ogihara, M. Using discriminant analysis for multi-class classification: An experimental investigation. Knowl. Inf. Syst. 2006, 10, 453–472. [Google Scholar] [CrossRef]
  42. Manning, C.D.; Raghavan, P.; Schütze, H. Introduction to Information Retrieval; Cambridge University Press: Cambridge, UK, 2008. [Google Scholar] [CrossRef]
  43. Hsu, C.W.; Chang, C.C.; Lin, C.J. A Practical Guide to Support Vector Classification; Technical Report; Department of Computer Science, National Taiwan University: Taipei, Taiwan, 2003. [Google Scholar]
  44. Hastie, T.; Tibshirani, R. Discriminant adaptive nearest neighbor classification and regression. Adv. Neural Inf. Process. Syst. 1996, 8, 409–415. [Google Scholar]
  45. Veenman, C.J.; Reinders, M.J. The nearest subclass classifier: A compromise between the nearest mean and nearest neighbor classifier. IEEE Trans. Pattern Anal. Mach. Intell. 2005, 27, 1417–1429. [Google Scholar] [CrossRef]
  46. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef] [Green Version]
  47. Graupe, D. Principles of Artificial Neural Networks; World Scientific: Singapore, 2013; Volume 7. [Google Scholar]
  48. Rasmussen, C.E. Gaussian processes in machine learning. In Advanced Lectures on Machine Learning; Lecture Notes in Computer Science; Bousquet, O., von Luxburg, U., Rätsch, G., Eds.; Springer: Berlin/Heidelberg, Germany, 2003; Volume 3176. [Google Scholar]
  49. Williams, C.K.; Rasmussen, C.E. Gaussian Processes for Machine Learning; MIT Press: Cambridge, MA, USA, 2006. [Google Scholar]
  50. Skamarock, W.C.; Klemp, J.B.; Dudhia, J.; Gill, D.O.; Barker, D.M.; Wang, W.; Powers, J.G. A Description of the Advanced Research WRF Version 3. NCAR Technical Note-475 + STR; NCAR Technical Note; University Corporation for Atmospheric Research; Citeseer: University Park, PA, USA, 2008. [Google Scholar] [CrossRef]
  51. Skamarock, W.C.; Klemp, J.B.; Dudhia, J.; Gill, D.O.; Barker, D.M.; Wang, W.; Powers, J.G. A Description of the Advanced Research WRF Version 2; Technical Report; Mesoscale and Microscale Meteorology at National Center for Atmospheric Research: Boulder, CO, USA, 2005. [Google Scholar]
  52. Michalakes, J.; Dudhia, J.; Gill, D.; Henderson, T.; Klemp, J.; Skamarock, W.; Wang, W. The weather research and forecast model: Software architecture and performance. In Use of High Performance Computing in Meteorology; World Scientific: Singapore, 2005; pp. 156–168. [Google Scholar]
  53. KACARE. Renewable Resource Atlas. 2015. Available online: http://rratlas.energy.gov.sa (accessed on 1 October 2015).
  54. Zell, E.; Gasim, S.; Wilcox, S.; Katamoura, S.; Stoffel, T.; Shibli, H.; Engel-Cox, J.; Subie, M.A. Assessment of solar radiation resources in Saudi Arabia. Sol. Energy 2015, 119, 422–438. [Google Scholar] [CrossRef] [Green Version]
  55. Notaro, M.; Alkolibi, F.; Fadda, E.; Bakhrjy, F. Trajectory analysis of Saudi Arabian dust storms. J. Geophys. Res. Atmos. 2013, 118, 6028–6043. [Google Scholar] [CrossRef]
  56. GFS. Global Forecast System Model. 2020. Available online: https://www.ncei.noaa.gov/products/weather-climate-models/global-forecast (accessed on 1 February 2020).
  57. USGS. U.S. Geological Survey. 2020. Available online: https://www.usgs.gov/ (accessed on 1 February 2020).
  58. Grell, G.A.; Dévényi, D. A generalized approach to parameterizing convection combining ensemble and data assimilation techniques. Geophys. Res. Lett. 2002, 29, 38-1–38-4. [Google Scholar] [CrossRef] [Green Version]
  59. Mlawer, E.J.; Taubman, S.J.; Brown, P.D.; Iacono, M.J.; Clough, S.A. Radiative transfer for inhomogeneous atmospheres: RRTM, a validated correlated-k model for the longwave. J. Geophys. Res. Atmos. 1997, 102, 16663–16682. [Google Scholar] [CrossRef] [Green Version]
  60. Dudhia, J. Numerical study of convection observed during the winter monsoon experiment using a mesoscale two-dimensional model. J. Atmos. Sci. 1989, 46, 3077–3107. [Google Scholar] [CrossRef]
  61. Hong, S.-Y.; Noh, Y.; Dudhia, J. A new vertical diffusion package with an explicit treatment of entrainment processes. Mon. Weather Rev. 2006, 134, 2318–2341. [Google Scholar] [CrossRef] [Green Version]
  62. Hedar, A.R.; Abdel-Hakim, A.E.; Almaraashi, M. Granular-based dimension reduction for solar radiation prediction using adaptive memory programming. In Proceedings of the 2016 on Genetic and Evolutionary Computation Conference Companion, Denver, CO, USA, 20–24 June 2016; pp. 929–936. [Google Scholar]
Figure 1. The flowchart of the proposed solar prediction models.
Figure 2. The flowchart of the proposed feature selection mTSAR method.
Figure 3. Reducing the cardinality of the best solution using the shaking procedure.
Figure 4. The layout of the hybrid prediction regression model.
Figure 5. Successive nested domains for model configuration.
Figure 6. γ-values of reducts with a single attribute using real-value, 5-class and 10-class data.
Figure 7. The real values of the decision attribute: Distributions of γ-values of reducts with dual and single attributes.
Figure 8. The 5-class decision attribute: Distributions of γ-values of reducts with dual and single attributes.
Figure 9. The 10-class decision attribute: Distributions of γ-values of reducts with dual and single attributes.
Figure 10. The predicted solar irradiance versus the real values for KAU dataset without reduction.
Figure 11. The predicted solar irradiance versus the real values for KAU dataset with reduction.
Figure 12. The predicted solar irradiance versus the real values for QU dataset without reduction.
Figure 13. The predicted solar irradiance versus the real values for QU dataset with reduction.
Figure 14. The predicted solar irradiance versus the real values for TU dataset without reduction.
Figure 15. The predicted solar irradiance versus the real values for TU dataset with reduction.
Figure 16. Distributions of KAU predicted classes.
Figure 17. Distributions of QU predicted classes.
Figure 18. Distributions of TU predicted classes.
Figure 19. Prediction models for KAU with the dust storm dataset.
Figure 20. Prediction models for QU with the dust storm dataset.
Figure 21. Prediction models for TU with the dust storm dataset.
Table 1. Solar attributes used in the current experiment.

| Attribute | Abbreviation | KAU | QU | TU |
|---|---|---|---|---|
| Air Temperature (°C) | T | ✓ | ✓ | ✓ |
| Average Wind Direction at 3 m (° North) | WD | ✓ | ✓ | ✓ |
| Average Wind Speed at 3 m (m/s) | WS | ✓ | ✓ | ✓ |
| Diffuse Horizontal Irradiance (Wh/m²) | DH | ✓ | ✓ | ✓ |
| Direct Normal Irradiance (Wh/m²) | DN | ✓ | ✓ | ✓ |
| Peak Wind Speed at 3 m (m/s) | PWS | ✓ | ✓ | ✓ |
| Relative Humidity (%) | H | ✓ | ✓ | ✓ |
| Station Pressure (mB (hPa equivalent)) | P | ✓ | ✓ | ✓ |
| Visibility | V | ✓ | ✓ | × |
Table 2. The data of the stations and their recorded measurements.

| Station | City | Latitude (N) | Longitude (E) | Elevation (m) | Data Samples |
|---|---|---|---|---|---|
| King Abdulaziz Univ. (KAU) | Jeddah | 21.49604 | 39.24492 | 75 | 582 |
| Qassim Univ. (QU) | Qassim | 26.34668 | 43.76645 | 688 | 576 |
| Taif Univ. (TU) | Taif | 21.43278 | 40.49173 | 1518 | 575 |
Table 3. Dates of widespread dust storm events covering different areas of Saudi Arabia in 2014.

| Month | Days |
|---|---|
| January | 19, 27 |
| February | 24 |
| March | 3, 9, 12, 16, 24, 27, 31 |
| April | 1, 3, 11, 15, 19, 27, 30 |
| May | 3, 7, 10, 13, 19, 23 |
| June | 5, 12, 16, 18, 22 |
| July | 5, 9, 13, 18, 20, 31 |
| August | 18 |
| October | 9, 11, 14, 16, 21 |
| November | 4 |
Table 4. The real values of the decision attribute: The best reducts for five independent runs.

| Dataset | T | WD | WS | DH | DN | PWS | H | P | V | Reduct Size | Reduct Quality |
|---|---|---|---|---|---|---|---|---|---|---|---|
| KAU | | | | | | | | | | 2 | 100% |
| QU | | | | | | | | | | 3 | 99.65% |
| | | | | | | | | | | 3 | 99.31% |
| | | | | | | | | | | 2 | 97.92% |
| | | | | | | | | | | 2 | 97.92% |
| TU | | | | | | | | | | 1 | 99.83% |
| | | | | | | | | | | 2 | 99.83% |
| | | | | | | | | | | 1 | 92.17% |
Table 5. The 5-class or the 10-class decision attribute: the best reducts for five independent runs.

| Dataset | Reduct Size | Reduct Quality |
|---|---|---|
| KAU | 2 | 100% |
| KAU | 3 | 100% |
| QU | 3 | 99.65% |
| QU | 2 | 99.31% |
| QU | 4 | 97.92% |
| QU | 4 | 97.92% |
| TU | 1 | 99.83% |
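The reduct quality reported in Tables 4 and 5 is the rough-set dependency degree used as the fitness function for attribute reduction. A minimal pure-Python sketch of how that degree can be computed (function and variable names are ours, for illustration only):

```python
def dependency_degree(rows, cond_idx, dec_idx):
    """Rough-set dependency degree gamma = |POS| / |U|: the fraction of
    objects whose values on the condition attributes (cond_idx) determine
    the decision attribute (dec_idx) unambiguously."""
    # Group objects into equivalence classes by their condition-attribute values.
    classes = {}
    for row in rows:
        key = tuple(row[i] for i in cond_idx)
        classes.setdefault(key, set()).add(row[dec_idx])
    # An object is in the positive region if its class maps to one decision value.
    positive = sum(1 for row in rows
                   if len(classes[tuple(row[i] for i in cond_idx)]) == 1)
    return positive / len(rows)
```

A reduct with quality 100% (as for KAU) means every record's decision class is fully determined by the selected attributes alone.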
Table 6. Class prediction rates of the KAU datasets.

| Classifier | 5 Classes, Without Reducts | 5 Classes, With Reducts | 10 Classes, Without Reducts | 10 Classes, With Reducts | Average |
|---|---|---|---|---|---|
| Optimizable SVM | 88.1% | 89.3% | 78.9% | 81.6% | 84.5% |
| Linear Discriminant Analysis | 87.6% | 88.1% | 76.6% | 76.8% | 82.3% |
| Optimizable Discriminant Analysis | 87.6% | 88.1% | 76.6% | 76.8% | 82.3% |
| Quadratic SVM | 86.9% | 89.7% | 73.0% | 78.4% | 82.0% |
| Linear SVM | 87.5% | 88.8% | 72.3% | 75.1% | 80.9% |
| Trilayered Neural Networks | 85.4% | 85.6% | 73.5% | 77.7% | 80.6% |
| Wide Neural Networks | 87.8% | 88.7% | 75.1% | 70.3% | 80.5% |
| Optimizable KNN Classifiers | 85.6% | 86.3% | 74.1% | 74.9% | 80.2% |
| Bilayered Neural Networks | 84.7% | 87.1% | 71.6% | 77.0% | 80.1% |
| Narrow Neural Networks | 85.6% | 86.6% | 71.8% | 75.6% | 79.9% |
| Medium Neural Networks | 87.6% | 87.1% | 72.3% | 68.7% | 78.9% |
| Cubic SVM | 83.2% | 88.7% | 68.6% | 73.5% | 78.5% |
| Ensemble Subspace Discriminant Analysis | 82.0% | 82.6% | 71.8% | 68.7% | 76.3% |
| Medium Gaussian SVM | 80.4% | 83.3% | 60.7% | 65.8% | 72.6% |
| Weighted KNN | 76.5% | 81.3% | 59.1% | 64.6% | 70.4% |
| Optimizable Ensemble Classifiers | 76.3% | 79.4% | 63.4% | 62.4% | 70.4% |
| Fine KNN | 74.2% | 78.9% | 57.6% | 62.9% | 68.4% |
| Ensemble Bagged Decision Trees | 70.4% | 72.3% | 62.2% | 62.2% | 66.8% |
| Cubic KNN | 76.3% | 78.2% | 54.3% | 58.2% | 66.8% |
| Optimizable Decision Tree | 73.4% | 75.4% | 57.9% | 59.8% | 66.6% |
| Medium KNN | 71.8% | 79.2% | 53.6% | 59.5% | 66.0% |
| Ensemble Boosted Decision Trees | 76.5% | 75.9% | 55.0% | 56.2% | 65.9% |
| Cosine KNN | 71.5% | 75.3% | 54.0% | 59.2% | 65.0% |
| Fine Decision Tree | 70.6% | 72.7% | 57.2% | 57.9% | 64.6% |
| Ensemble Subspace KNN | 74.7% | 67.4% | 61.9% | 49.8% | 63.5% |
| Fine Gaussian SVM | 68.9% | 78.7% | 45.5% | 60.1% | 63.3% |
| Coarse Gaussian SVM | 73.4% | 74.9% | 50.3% | 51.4% | 62.5% |
| Ensemble RUS Boosted Decision Trees | 72.9% | 73.4% | 50.3% | 51.4% | 62.0% |
| Medium Decision Tree | 68.0% | 68.2% | 52.2% | 50.2% | 59.7% |
| Optimizable Naïve Bayes | 70.1% | 68.7% | 50.7% | 48.6% | 59.5% |
| Kernel Naïve Bayes | 69.4% | 68.7% | 50.5% | 48.6% | 59.3% |
| Coarse KNN | 62.8% | 67.2% | 43.1% | 44.8% | 54.5% |
| Coarse Decision Tree | 61.7% | 61.7% | 40.5% | 40.5% | 51.1% |
| Gaussian Naïve Bayes | 60.1% | 58.1% | 40.5% | 39.2% | 49.5% |
| Quadratic Discriminant Analysis | 74.6% | 74.7% | | | |
Table 7. Class prediction rates of the QU datasets.

| Classifier | 5 Classes, Without Reducts | 5 Classes, With Reducts | 10 Classes, Without Reducts | 10 Classes, With Reducts | Average |
|---|---|---|---|---|---|
| Optimizable SVM | 86.2% | 93.9% | 77.9% | 86.4% | 86.1% |
| Trilayered Neural Networks | 84.7% | 93.4% | 73.5% | 82.1% | 83.4% |
| Optimizable KNN Classifiers | 86.2% | 91.1% | 74.2% | 81.2% | 83.2% |
| Bilayered Neural Networks | 85.4% | 90.9% | 74.4% | 81.5% | 83.1% |
| Narrow Neural Networks | 85.9% | 92.2% | 70.9% | 82.1% | 82.8% |
| Wide Neural Networks | 84.7% | 91.6% | 71.4% | 81.4% | 82.3% |
| Medium Neural Networks | 86.1% | 91.6% | 70.6% | 80.5% | 82.2% |
| Linear SVM | 83.1% | 92.9% | 69.7% | 82.9% | 82.2% |
| Quadratic SVM | 83.6% | 92.0% | 70.6% | 81.2% | 81.9% |
| Cubic SVM | 80.3% | 90.8% | 67.4% | 79.3% | 79.5% |
| Optimizable Ensemble Classifiers | 84.8% | 84.1% | 71.3% | 70.6% | 77.7% |
| Medium Gaussian SVM | 78.0% | 87.1% | 62.4% | 72.8% | 75.1% |
| Ensemble Subspace Discriminant Analysis | 78.6% | 79.1% | 67.9% | 66.0% | 72.9% |
| Ensemble Bagged Decision Trees | 79.4% | 82.2% | 64.1% | 65.3% | 72.8% |
| Fine KNN | 74.6% | 85.7% | 57.1% | 70.7% | 72.0% |
| Weighted KNN | 76.3% | 85.0% | 55.9% | 70.9% | 72.0% |
| Ensemble Boosted Decision Trees | 81.2% | 82.0% | 58.4% | 60.5% | 70.5% |
| Optimizable Decision Tree | 77.7% | 81.5% | 60.3% | 62.2% | 70.4% |
| Cubic KNN | 74.2% | 82.8% | 51.2% | 66.4% | 68.7% |
| Ensemble Subspace KNN | 79.3% | 73.3% | 69.5% | 51.7% | 68.5% |
| Medium KNN | 74.6% | 81.9% | 51.9% | 63.8% | 68.1% |
| Cosine KNN | 73.0% | 81.7% | 50.7% | 63.2% | 67.2% |
| Fine Decision Tree | 72.3% | 78.6% | 55.4% | 57.8% | 66.0% |
| Fine Gaussian SVM | 68.8% | 80.7% | 44.4% | 65.3% | 64.8% |
| Medium Decision Tree | 75.3% | 76.8% | 51.6% | 53.7% | 64.4% |
| Coarse Gaussian SVM | 72.0% | 77.0% | 50.0% | 52.4% | 62.9% |
| Coarse KNN | 66.9% | 70.0% | 45.8% | 48.4% | 57.8% |
| Ensemble RUS Boosted Decision Trees | 71.3% | 70.9% | 39.7% | 42.2% | 56.0% |
| Coarse Decision Tree | 66.2% | 65.5% | 46.0% | 46.0% | 55.9% |
| Linear Discriminant Analysis | 85.8% | 93.0% | 84.0% | | |
| Optimizable Discriminant Analysis | 87.8% | 93.0% | 84.0% | | |
| Kernel Naïve Bayes | 72.0% | 69.9% | 51.2% | | |
| Optimizable Naïve Bayes | 72.0% | 70.6% | 52.1% | | |
| Gaussian Naïve Bayes | 65.2% | 64.8% | | | |
| Quadratic Discriminant Analysis | | | | | |
Table 8. Class prediction rates of the TU datasets.

| Classifier | 5 Classes, Without Reducts | 5 Classes, With Reducts | 10 Classes, Without Reducts | 10 Classes, With Reducts | Average |
|---|---|---|---|---|---|
| Optimizable Ensemble Classifiers | 84.5% | 79.4% | 66.9% | 61.3% | 73.0% |
| Bilayered Neural Networks | 63.8% | 88.2% | 52.8% | 74.4% | 69.8% |
| Trilayered Neural Networks | 60.5% | 85.7% | 51.9% | 76.1% | 68.6% |
| Ensemble Boosted Decision Trees | 76.7% | 77.0% | 59.8% | 60.1% | 68.4% |
| Narrow Neural Networks | 62.2% | 85.4% | 50.2% | 75.1% | 68.2% |
| Wide Neural Networks | 66.0% | 85.2% | 48.1% | 70.6% | 67.5% |
| Medium Neural Networks | 65.7% | 85.2% | 48.3% | 70.6% | 67.5% |
| Ensemble Subspace Discriminant Analysis | 77.0% | 79.4% | 51.0% | 61.5% | 67.2% |
| Optimizable Decision Tree | 74.9% | 75.3% | 59.1% | 59.1% | 67.1% |
| Ensemble Bagged Decision Trees | 74.0% | 76.7% | 55.1% | 58.7% | 66.1% |
| Optimizable KNN Classifiers | 62.9% | 88.0% | 41.8% | 69.5% | 65.6% |
| Optimizable SVM | 51.6% | 91.6% | 39.4% | 75.1% | 64.4% |
| Quadratic SVM | 49.0% | 89.9% | 42.0% | 76.5% | 64.4% |
| Fine Decision Tree | 68.3% | 74.2% | 54.4% | 57.7% | 63.7% |
| Medium Decision Tree | 69.0% | 73.5% | 55.4% | 56.4% | 63.6% |
| Linear SVM | 48.6% | 87.2% | 39.5% | 73.7% | 62.3% |
| Cubic SVM | 48.4% | 85.4% | 39.5% | 73.7% | 61.8% |
| Ensemble Subspace KNN | 70.4% | 77.0% | 46.3% | 51.0% | 61.2% |
| Weighted KNN | 58.7% | 81.4% | 38.3% | 61.7% | 60.0% |
| Medium Gaussian SVM | 46.9% | 85.5% | 38.3% | 66.4% | 59.3% |
| Fine KNN | 58.0% | 80.5% | 36.8% | 59.1% | 58.6% |
| Medium KNN | 57.7% | 80.0% | 37.1% | 58.0% | 58.2% |
| Cubic KNN | 57.1% | 78.7% | 37.1% | 57.1% | 57.5% |
| Coarse Decision Tree | 67.8% | 66.7% | 44.6% | 44.6% | 55.9% |
| Cosine KNN | 55.4% | 78.0% | 33.8% | 55.4% | 55.7% |
| Coarse Gaussian SVM | 43.7% | 79.4% | 32.9% | 53.8% | 52.5% |
| Coarse KNN | 53.3% | 71.4% | 26.3% | 46.3% | 49.3% |
| Fine Gaussian SVM | 38.3% | 79.6% | 7.0% | 48.1% | 43.3% |
| Ensemble RUS Boosted Decision Trees | 55.7% | 57.7% | 19.3% | 30.8% | 40.9% |
| Kernel Naïve Bayes | 69.2% | 48.8% | | | |
| Optimizable Naïve Bayes | 69.2% | 48.8% | | | |
| Linear Discriminant Analysis | 86.4% | | | | |
| Optimizable Discriminant Analysis | 86.4% | | | | |
| Quadratic Discriminant Analysis | | | | | |
| Gaussian Naïve Bayes | | | | | |
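Tables 6–8 compare each classifier trained on all attributes against the same classifier trained only on the reduct's attributes. The study used a suite of ready-made classifiers; as a self-contained illustration of the protocol (not the authors' code), here is a plain k-nearest-neighbour accuracy check run once on the full feature matrix and once on a reduct, i.e., a column subset:

```python
import numpy as np

def knn_accuracy(X_train, y_train, X_test, y_test, k=3):
    """Accuracy of a plain k-NN classifier (Euclidean distance,
    majority vote among the k nearest training samples)."""
    correct = 0
    for x, y in zip(X_test, y_test):
        dist = np.linalg.norm(X_train - x, axis=1)
        nearest = y_train[np.argsort(dist)[:k]]
        labels, counts = np.unique(nearest, return_counts=True)
        correct += int(labels[np.argmax(counts)] == y)
    return correct / len(y_test)

# reduct = [0, 3]  # hypothetical column indices of the selected attributes
# acc_full   = knn_accuracy(X_tr, y_tr, X_te, y_te)
# acc_reduct = knn_accuracy(X_tr[:, reduct], y_tr, X_te[:, reduct], y_te)
```

Comparing `acc_full` and `acc_reduct` per classifier and per class granularity is exactly the "Without Reducts" versus "With Reducts" contrast tabulated above.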
Table 9. Root Mean Square Error (RMSE), Mean Absolute Error (MAE) and R-Squared of GHI values (in W/m2) using regression prediction models on the KAU, QU and TU datasets.

| Metric | KAU, Without Reducts | KAU, With Reducts | QU, Without Reducts | QU, With Reducts | TU, Without Reducts | TU, With Reducts |
|---|---|---|---|---|---|---|
| RMSE | 124.76 | 118.11 | 146.79 | 143.78 | 170.34 | 171.80 |
| MAE | 93.067 | 85.913 | 105.110 | 102.530 | 126.200 | 129.340 |
| R-Squared | 0.99 | 0.99 | 0.99 | 0.99 | 0.98 | 0.98 |
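The three error measures in Table 9 are standard; for reference, a compact NumPy sketch of how they are computed from observed and predicted GHI series (function name is ours):

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return (RMSE, MAE, R-squared) for predicted GHI values in W/m^2."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    rmse = float(np.sqrt(np.mean(err ** 2)))           # root mean square error
    mae = float(np.mean(np.abs(err)))                  # mean absolute error
    ss_res = float(np.sum(err ** 2))                   # residual sum of squares
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    return rmse, mae, 1.0 - ss_res / ss_tot            # last value is R-squared
```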
Table 10. RMSE of GHI values (in W/m2) using different regression prediction models on the dust storm datasets.

| Prediction Model | KAU | QU | TU |
|---|---|---|---|
| Numerical Model (WRF) | 1412.06 | 1196.76 | 1354.16 |
| Machine Learning Model (GPR) | 440.02 | 737.70 | 334.46 |
| Hybrid Model | 421.15 | 695.41 | 365.43 |
| Machine Learning Model with Feature Selection | 514.19 | 645.37 | 666.37 |
| Hybrid Model with Feature Selection | 559.03 | 631.13 | 641.65 |
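The hybrid rows of Table 10 use the WRF output as an additional learning element alongside the measured attributes. One simple way to realize such a combination, shown here as an illustrative sketch rather than the authors' exact formulation, is a least-squares blend of the numerical forecast and the machine-learning prediction:

```python
import numpy as np

def fit_hybrid(wrf_pred, ml_pred, y_true):
    """Fit weights w so that w[0]*WRF + w[1]*ML + w[2] approximates the
    observed GHI in the least-squares sense."""
    A = np.column_stack([wrf_pred, ml_pred, np.ones(len(wrf_pred))])
    w, *_ = np.linalg.lstsq(A, y_true, rcond=None)
    return w

def apply_hybrid(w, wrf_pred, ml_pred):
    """Blend a WRF forecast and an ML prediction with fitted weights."""
    A = np.column_stack([wrf_pred, ml_pred, np.ones(len(wrf_pred))])
    return A @ w
```

Such a blend can only improve on the better of its two inputs when their errors are partly independent, which is consistent with the hybrid model beating both WRF and GPR on the KAU and QU dust storm data.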
Hedar, A.-R.; Almaraashi, M.; Abdel-Hakim, A.E.; Abdulrahim, M. Hybrid Machine Learning for Solar Radiation Prediction in Reduced Feature Spaces. Energies 2021, 14, 7970. https://doi.org/10.3390/en14237970
