Article

Advancing Electricity Consumption Forecasts in Arid Climates through Machine Learning and Statistical Approaches

by Abdalrahman Alsulaili *, Noor Aboramyah, Nasser Alenezi and Mohamad Alkhalidi
Civil Engineering Department, Kuwait University, P.O. Box 5969, Kuwait City 13060, Kuwait
* Author to whom correspondence should be addressed.
Sustainability 2024, 16(15), 6326; https://doi.org/10.3390/su16156326
Submission received: 5 June 2024 / Revised: 18 July 2024 / Accepted: 22 July 2024 / Published: 24 July 2024

Abstract
This study investigated the impact of meteorological factors on electricity consumption in arid regions, characterized by extreme temperatures and high humidity. Statistical approaches such as multiple linear regression (MLR) and multiplicative time series (MTS), alongside the advanced machine learning method Extreme Gradient Boosting (XGBoost), were used to analyze historical consumption data. The models were evaluated using established measures: the Coefficient of Determination ($R^2$), Root Mean Squared Error (RMSE), and Mean Absolute Percentage Error (MAPE). The models were highly accurate, with regression-type models consistently achieving an $R^2$ greater than 0.9. RMSE and MAPE values were also very low relative to the overall data scale, reinforcing the models’ precision and reliability. The analysis not only highlights the significant meteorological drivers of electricity consumption but also assesses the models’ effectiveness in managing seasonal and irregular variations. These findings offer crucial insights for improving energy management and promoting sustainability in similar climatic regions.

1. Introduction

Electricity is crucial in today’s world, serving as a fundamental resource for many industries and playing an important role in driving the economy. As a consequence, electricity consumption has increased rapidly with growing demand. According to the International Energy Agency (IEA), electricity consumption increased by 130% from 1990 to 2019 [1]. The IEA estimated that electricity consumption would exceed half of total energy use by the year 2050 [2]. The increasing demand for electricity can be attributed to a variety of factors, such as population growth, urbanization, and industrial expansion [3,4]. It is also often linked to growth in gross domestic product (GDP), global economic competition, improvements in power efficiency, and technological advancements [5,6]. Hasanuzzaman et al. [7] anticipate a 49% rise in worldwide energy consumption during the period 2007–2035. Enerdata [8] reports that global electricity consumption increased by approximately 6% in 2021, which represents a 4.8% increase compared to the demand in 2019, following a 0.7% decline in 2020 due to the impact of the COVID-19 pandemic. Accordingly, there is growing concern over the sustainability of electricity supply due to increasing consumption [3]. Recent findings by Nti et al. [9] indicate a significant rise in household electricity usage worldwide. Moreover, Proedrou [10] reports that residential buildings account for one-third of the European Union’s total electricity consumption. Similarly, in the Arab region, Krarti [11] found that residential buildings consumed nearly 800 TWh of energy in 2015, which is 50% of the energy demand. These numbers suggest that climate change and global warming have a significant impact on electricity consumption.
For instance, in 2021, the highest worldwide increase in energy demand was recorded at 1200 TWh, leading to a notable rise in CO2 emissions from the electrical industry [2]. Additionally, projections suggest that global heating energy consumption is expected to decrease by nearly one-third by 2100, while the cooling demand is expected to rise by more than 70% during the same period [12]. This highlights the impact of various factors such as heat and cold on electricity usage. Therefore, it is crucial to examine meteorological elements, including but not limited to temperature, humidity, and precipitation. Investigating these factors is essential to understanding their influence on electricity consumption. By examining various factors influencing electricity usage, predictive models can be developed to forecast future demand. This understanding supports the better planning and management of energy, leading to more efficient and sustainable power distribution. Proedrou [10] recommends an urgent understanding of how residential electricity usage is modeled to avoid future demand problems. Constructing a predictive model serves various purposes, such as boosting the economy, managing energy smartly, and understanding consumption patterns [13].
Kuwait is an example of a hot arid climate, and it ranks third globally in electricity consumption per capita (kWh/capita) [14]. Although Kuwait has a smaller population than neighboring countries, its high electricity usage can be explained by factors such as high income, a strong GDP, and government subsidies for electricity prices. A comparison of electricity prices between Kuwait and the other Gulf Cooperation Council (GCC) countries indicates that Kuwait has the cheapest electricity tariff [15]. In addition, Alawadhi et al. [16] found that the type of housing, such as apartments or villas, and household size have the biggest impact on electricity consumption in Kuwait. The study showed that 65% of the country’s gross electricity usage is consumed by the residential sector, mostly due to HVAC systems. It also indicates that more than half of the homes in Kuwait are not entirely energy efficient; for instance, only 16.5% of households have energy-efficient HVAC systems, primarily air conditioning. This fact encourages examining meteorological factors for a better understanding of electricity usage patterns. Bunn [17] showed that meteorological parameters are key inputs in electrical load predictions. Wang et al. [18] and Rajbhandari et al. [19] found that meteorological parameters are the main reasons for high electricity consumption. Their findings align with those of Asumadu Sarkodie et al. [20], who observed that electricity consumption increases with both temperature and humidity. Even though recent studies have started to use meteorological parameters more, researchers have been slow to depend on them fully. This is due to two main reasons: the inaccuracy of weather forecasts and data, and the delay in receiving the data [21]. Nevertheless, Wang et al. [18] and Holtedahl and Joutz [22] suggest that including meteorological factors is crucial for predicting electricity consumption.
Hinman and Hickey [23] and Dordonnat et al. [24] emphasize the importance of considering the relationship between temperature and electricity consumption in forecasting. Goeb et al. [21] also mentioned that adding the temperature to meteorological factors improves the accuracy of electricity predictions. Once more, Wang et al. [18] highlight the importance of using temperature and humidity–temperature index as key meteorological factors in electricity consumption prediction.
Moreover, given the significant role of machine learning in current prediction research, several studies have employed machine learning algorithms to forecast electricity consumption. Reddy et al. [25] investigated various machine learning methods, such as K-nearest neighbors, XGBoost, random forest, and artificial neural networks, for predicting electricity usage. Their results indicate that these approaches can accurately predict power usage. The K-nearest neighbors model performed the best, achieving an accuracy rate of 90.92%. Another study that used lagged historical information found that linear regression and support vector regression yielded a success rate of 85.7% [26].
This study addresses a critical gap in the existing research on electricity consumption in hot arid regions, which often lacks the use of modern methods, by developing a more accurate prediction model that incorporates meteorological factors and accounts for long-term effects. Despite numerous studies, there remains a lack of comprehensive models that integrate climate variables effectively to predict electricity needs in arid environments like Kuwait. By using both statistical methods and advanced machine learning techniques, this research enhances the accuracy and relevance of electricity consumption forecasts. Another objective is to examine the performance of certain machine learning algorithms, given their current prominence. Historical demand records and meteorological data are used to predict future consumption, offering a robust assessment of the real-world application of the findings. This approach not only refines prediction methodologies but also provides actionable insights for sustainable energy management in arid climates.

2. Methodology

2.1. Study Area and Data

Kuwait was selected as the study area. It is located in the northwest corner of the Arabian Gulf, as shown in Figure 1, and has an area of 17.8 thousand square kilometers. Kuwait lies in an arid climate zone known for its severe heat, droughts, and scarcity of natural resources. During the summer of 2021, it experienced a very high temperature of 53.5 °C, which is considered one of the highest temperatures ever recorded worldwide [27]. The rainy season typically spans from October to May, while the hot and dry summer lasts from June to September [28].
Despite its relatively small geographical size, Kuwait operates nine power stations, eight of which utilize conventional energy sources, while one is dedicated to renewable energy. Detailed electricity consumption data were collected from the Ministry of Electricity, Water, and Renewable Energy (MEW) over a five-year span from 2017 to 2021. Additionally, comprehensive meteorological parameters such as relative humidity (max, min, avg), air temperature (max, min, avg), dew point temperature (max, min, avg), wind speed, wind direction, and rainfall were systematically collected from the Directorate General of Civil Aviation (DGCA). This period corresponds with the electricity data collection timeframe. The DGCA manages a network of twenty-seven automated meteorological stations strategically distributed across Kuwaiti land and maritime territories. Given Kuwait’s relatively small size, the Kuwait International Airport (KIA) station is considered sufficient to represent the country’s meteorological conditions for this study.

2.2. Data Processing

The data required processing and cleaning before the analysis stage. This step focused particularly on the meteorological factors because of gaps resulting from missing days. Out of the total 1826 observations between 2017 and 2021, data for 27 days were missing, which is less than 2% of the total records. Because this percentage is small, the study advanced to the next step. A preprocessing step was then implemented to eliminate outliers and anomalies. This was conducted using Tukey’s method, which relies on a boxplot that summarizes continuous univariate data by graphically representing key values: the median, the lower quartile (Q1), the upper quartile (Q3), and the highest and lowest values [29]. Quartiles are the values that divide an ordered dataset into four equal parts. The distance between the first quartile (Q1) and the third quartile (Q3) in a boxplot is the interquartile range (IQR). The inner borders lie 1.5 IQRs below Q1 and above Q3, while the outer borders lie 3 IQRs below Q1 and above Q3 [30]. An outlier can be marked as either a possible or a probable outlier. If the value falls between the inner and outer borders, it is considered a possible outlier. If the value falls outside the outer border, it is considered a probable outlier. Only the probable outliers are discarded. Additionally, the linear relationship between two variables can be quantified using Pearson’s correlation coefficient [31]. A straight line is fitted through the data points to represent the most suitable linear relationship, with the correlation coefficient indicating how closely the points align with this line.
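The fence-based screening and the correlation measure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors’ code: the function names (`tukey_fences`, `label_outliers`, `pearson_r`), the quartile interpolation method, and the wind speed values are all hypothetical.

```python
import statistics

def tukey_fences(values):
    """Return the (inner, outer) fences built from Q1, Q3, and the IQR."""
    # Assumption: "inclusive" quartile interpolation; other conventions exist.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    inner = (q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    outer = (q1 - 3.0 * iqr, q3 + 3.0 * iqr)
    return inner, outer

def label_outliers(values):
    """Label each value as 'ok', 'possible', or 'probable'."""
    inner, outer = tukey_fences(values)
    labels = []
    for v in values:
        if v < outer[0] or v > outer[1]:
            labels.append("probable")   # beyond the outer border
        elif v < inner[0] or v > inner[1]:
            labels.append("possible")   # between inner and outer borders
        else:
            labels.append("ok")
    return labels

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Made-up daily wind speeds (m/s), for illustration only.
wind_speed = [3.1, 3.4, 2.9, 3.8, 4.0, 3.3, 3.6, 5.9, 12.5, 3.2]
labels = label_outliers(wind_speed)
# Only probable outliers are dropped, as in the text.
cleaned = [v for v, lab in zip(wind_speed, labels) if lab != "probable"]
```

Keeping the possible outliers retains unusual but plausible readings, while values far outside the bulk of the data are removed.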
Hierarchical clustering analysis (HCA) is a method used to group similar objects or features into clusters based on their characteristics. Also described as clustering within clusters, it aims to create a dendrogram, also known as a tree diagram. This method is used here to discover and illustrate monthly patterns. In a tree diagram, the point where two nodes merge, termed the “joining together” node, indicates their similarity level; nodes that merge sooner are more similar than those merging later [32]. Following the recommendation of Gere [33], Ward’s method is employed here due to its reliability and appropriateness for the analysis. The data were standardized using the z-score technique. Hierarchical clustering analysis serves as an additional method to analyze electricity consumption, incorporating the meteorological factors to enhance the understanding of the parameters involved.
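As a rough illustration of the clustering step, the sketch below standardizes monthly values with z-scores and then merges clusters agglomeratively using Ward’s criterion (the pair whose merge adds the least within-cluster variance is joined first). It is a one-dimensional toy version with hypothetical names (`zscore`, `ward_clusters`) and made-up monthly temperatures, so its groupings should not be read as the study’s results.

```python
import statistics

def zscore(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    mu = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mu) / sd for v in values]

def ward_clusters(names, values, k):
    """Agglomerative clustering of 1-D values with Ward's merge cost."""
    clusters = [([name], [val]) for name, val in zip(names, values)]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                a, b = clusters[i][1], clusters[j][1]
                gap = statistics.fmean(a) - statistics.fmean(b)
                # Increase in within-cluster sum of squares if a and b merge.
                cost = len(a) * len(b) / (len(a) + len(b)) * gap ** 2
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        merged = (clusters[i][0] + clusters[j][0],
                  clusters[i][1] + clusters[j][1])
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return [members for members, _ in clusters]

# Made-up monthly mean air temperatures (deg C), for illustration only.
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
temps = [13.0, 15.6, 20.4, 26.0, 32.0, 37.0,
         39.0, 38.5, 35.0, 28.5, 20.0, 15.0]

groups = ward_clusters(months, zscore(temps), k=2)
```

Recording the order and cost of each merge would give the information needed to draw the dendrogram; a library routine would normally be preferred for multivariate data.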

2.3. Modeling

2.3.1. Multiple Linear Regression (MLR) Forecasting Model

After the meteorological factors most correlated with electricity consumption are identified with the Pearson correlation test, and their effect on electricity consumption is explored using linear regression, these factors are used to establish the best-fit model. MLR was selected for its straightforward implementation. The procedure considers 16 meteorological factors (independent variables), determines which ones exhibit the strongest correlations with electricity consumption (dependent variable), and discards the uncorrelated factors. This process eliminates unsuitable variables and narrows down the remaining candidates to construct a high-quality regression model. Regression is commonly employed by researchers to compare and analyze the connection between two continuous variables. Musarat et al. [34] suggested adopting these statistical techniques in time series modeling because they help clarify the causal connections between variables. The process involves three attempts: the first includes all 16 meteorological variables, the second incorporates the most correlated variables obtained from the correlation step (7 meteorological variables), and the third includes only the meteorological variable with the strongest correlation among the 16 variables.

2.3.2. Multiplicative Time Series (MTS) Forecasting Model

The multiplicative time series (MTS) model is a statistical approach used to analyze and forecast data by multiplying the series’ components together. Electricity records in this study have a seasonal pattern, and MTS is well known for effectively capturing regular patterns and trends. Thus, the electricity consumption time series ($Y_t$) can be broken into three independent parameters: seasonal ($S_t$), irregular ($I_t$), and trend ($T_t$), and the three parameters are expressed in the same time series units (MW·h, t) [35]. For this study, the input was monthly electricity consumption records from 2017 to the end of 2021, which corresponds to time ($t$) ranging from 1 to 60 months. The forecast period extends from 2022 to 2026, with projected time ($\hat{t}$) spanning from 61 to 120 months. Additionally, the data reveal a seasonal pattern that repeats every 12 months. The equation of the multiplicative time series model can be expressed as:
$$Y_t = S_t \times I_t \times T_t$$
where $Y_t$ is the average electricity consumption at time $t$; $S_t$ is the seasonal parameter; $I_t$ is the irregularity parameter; and $T_t$ is the trend parameter.
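One common way to apply a multiplicative model is the classical decomposition sketched below: estimate the trend with a centered 12-month moving average, average the ratios $Y_t / T_t$ by calendar month to obtain seasonal indices, fit a straight line to the deseasonalized series, and multiply the extrapolated trend by the seasonal index. This is an assumption about the procedure, not necessarily the authors’ exact steps; the function names and the synthetic series are hypothetical, and time is 0-indexed here rather than running from 1 to 60.

```python
import statistics

def centered_ma(y, period=12):
    """Centered moving average (2 x period), used as the trend estimate."""
    half = period // 2
    trend = [None] * len(y)
    for t in range(half, len(y) - half):
        window = y[t - half:t + half + 1]
        # End points get half weight so the window spans exactly one period.
        total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        trend[t] = total / period
    return trend

def seasonal_indices(y, period=12):
    """Average the ratios Y_t / T_t per calendar month; normalize to mean 1."""
    trend = centered_ma(y, period)
    ratios = [[] for _ in range(period)]
    for t, (obs, tr) in enumerate(zip(y, trend)):
        if tr is not None:
            ratios[t % period].append(obs / tr)
    raw = [statistics.fmean(r) for r in ratios]
    scale = statistics.fmean(raw)
    return [r / scale for r in raw]

def fit_line(xs, ys):
    """Ordinary least squares slope and intercept for one predictor."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    slope = (sum((x - mx) * (v - my) for x, v in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

def forecast(y, horizon, period=12):
    """Forecast = extrapolated linear trend x seasonal index."""
    s = seasonal_indices(y, period)
    deseasonalized = [obs / s[t % period] for t, obs in enumerate(y)]
    slope, intercept = fit_line(list(range(len(y))), deseasonalized)
    return [(intercept + slope * t) * s[t % period]
            for t in range(len(y), len(y) + horizon)]

# Synthetic 60-month series (made up): linear trend times a seasonal pattern.
true_s = [0.7, 0.7, 0.8, 0.9, 1.1, 1.3, 1.4, 1.4, 1.2, 1.0, 0.8, 0.7]
history = [(100 + 0.5 * t) * true_s[t % 12] for t in range(60)]

s_hat = seasonal_indices(history)
future = forecast(history, horizon=60)   # months 61 to 120
```

Seasonal indices above 1.0 mark months with above-trend consumption, which is how the high-demand months can be read off the fitted model.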

2.3.3. Extreme Gradient Boosting (XGBoost)

Extreme Gradient Boosting (XGBoost) is a robust supervised machine learning algorithm frequently used in tasks like regression. It is built on the principles of gradient boosting, which can be summarized as training models sequentially so that, at each iteration, a new model is fitted to correct the errors of the previous ones [36]. This makes XGBoost effective when dealing with complex datasets that have nonlinear relationships. It is an efficient version of the Gradient Boosted Trees algorithm, which is a method that fine-tunes functions by improving specific loss functions and using different regularization techniques [37]. The functionality of XGBoost can be summarized into simple steps [38]. Initially, the predicted value is initialized, and the gradient of the loss function is calculated. In each training iteration, a new decision tree is added to the ensemble, its prediction $f_k(x)$ is determined, and the structure of the tree is optimized to minimize the objective function (Obj), which can be defined as
$$\mathrm{Obj} = \sum_{i=1}^{n} L\left(y_i, \hat{y}_i\right) + \sum_{k=1}^{K} \Omega\left(f_k\right)$$
where $y_i$ is the target value for the $i$th sample, $\hat{y}_i$ is the predicted value for the $i$th sample, $L(y_i, \hat{y}_i)$ is the loss function, and $\Omega(f_k)$ represents the regularization terms. This process is repeated until a predefined number of trees (iterations) is reached or until the objective function converges. Once all trees have been added to the ensemble, the final prediction ($\hat{y}$) is obtained as
$$\hat{y}_i = \sum_{k=1}^{T} f_k\left(x_i\right) = \hat{y}_i^{(t-1)} + f_t\left(x_i\right)$$
where $\hat{y}_i^{(t-1)}$ is the predicted value from the previous iteration and $f_t(x_i)$ is the newly added prediction function.
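The additive update in the last equation can be illustrated with a small gradient boosting sketch. It is not XGBoost itself: it uses one-split regression stumps, squared error loss, and a learning rate, and it omits the regularization term $\Omega$ and the second-order gradient information that XGBoost uses. The function names and the temperature and load values are hypothetical.

```python
import statistics

def fit_stump(xs, residuals):
    """Best single-split regression stump under squared error."""
    best = None
    # Exclude the largest x so the right-hand side is never empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        lm, rm = statistics.fmean(left), statistics.fmean(right)
        sse = (sum((r - lm) ** 2 for r in left)
               + sum((r - rm) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, lm, rm)
    _, threshold, lm, rm = best
    return lambda x: lm if x <= threshold else rm

def boost(xs, ys, n_trees=50, learning_rate=0.3):
    """Sequentially add stumps, each fitted to the current residuals."""
    base = statistics.fmean(ys)           # initial prediction
    trees = []
    preds = [base] * len(ys)
    for _ in range(n_trees):
        # Residuals are the negative gradient of the squared error loss.
        residuals = [y - p for y, p in zip(ys, preds)]
        tree = fit_stump(xs, residuals)
        trees.append(tree)
        # Additive update: new prediction = previous prediction + new tree.
        preds = [p + learning_rate * tree(x) for p, x in zip(preds, xs)]

    def predict(x):
        return base + learning_rate * sum(tree(x) for tree in trees)

    return predict

# Made-up temperature (deg C) and load values, for illustration only.
temps = [15, 20, 25, 30, 35, 40, 45]
load = [5.0, 4.6, 5.2, 6.8, 8.9, 11.0, 12.4]

model = boost(temps, load)
mean_load = statistics.fmean(load)
mse_base = statistics.fmean((y - mean_load) ** 2 for y in load)
mse_boost = statistics.fmean((y - model(x)) ** 2 for x, y in zip(temps, load))
```

Each round fits only what the earlier trees left unexplained, which is why the ensemble can follow a nonlinear temperature response that a single line would miss.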
XGBoost also includes several features that enhance its utility and flexibility, such as handling missing data, supporting various objective functions, and providing several parameters that can be tuned for specific datasets and problems. These features make XGBoost a highly adaptable tool for machine learning tasks across different domains, allowing it to deliver a superior performance compared to many traditional models. Additionally, XGBoost supports built-in cross-validation and leverages parallel processing to enhance computational efficiency, making it ideal for handling large datasets. Its ability to execute multiple tree-building processes simultaneously allows for more rapid model development and improved accuracy, which is crucial when dealing with complex or voluminous data. The flexibility to adjust various parameters helps practitioners fine-tune the algorithm to achieve the best possible results for any given challenge.

2.4. Performance Analysis

To evaluate the models on the same scale, the dataset was divided into training and testing sets, allocating the final year for testing and the earlier years for training. The performance of the electricity consumption prediction models was evaluated using the root mean square error (RMSE), mean absolute percentage error (MAPE), and coefficient of determination ($R^2$), which are calculated using the following formulas:
$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2}{n}}$$
$$\mathrm{MAPE} = \frac{1}{n} \sum_{i=1}^{n} \frac{\left|y_i - \hat{y}_i\right|}{y_i}$$
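$$R^2 = 1 - \frac{\sum_{i=1}^{n} e_i^2}{\sum_{i=1}^{n} \left(y_i - \bar{y}\right)^2}$$
where $y_i$ is the observed electricity consumption, $\hat{y}_i$ is the predicted electricity consumption, $\bar{y}$ is the mean of the observed values, $n$ is the number of data points, and $e_i^2 = (y_i - \hat{y}_i)^2$ is the squared residual.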
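For reference, the three measures can be computed as in the sketch below. The function names and sample values are hypothetical. MAPE is returned as a fraction (multiply by 100 for a percentage), and $R^2$ uses the standard definition with the total sum of squares taken about the mean of the observed values.

```python
import math

def rmse(y, y_hat):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(y, y_hat)) / len(y))

def mape(y, y_hat):
    """Mean absolute percentage error as a fraction; requires nonzero y."""
    return sum(abs(a - p) / abs(a) for a, p in zip(y, y_hat)) / len(y)

def r2(y, y_hat):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - p) ** 2 for a, p in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot

# Made-up observed and predicted values, for illustration only.
observed = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]

scores = {
    "RMSE": rmse(observed, predicted),
    "MAPE": mape(observed, predicted),
    "R2": r2(observed, predicted),
}
```

RMSE is expressed in the units of the data, so it should be judged against the scale of consumption, whereas MAPE and $R^2$ are unitless and easier to compare across models trained on daily and monthly series.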

3. Results and Discussion

The electricity demand in Kuwait is growing annually due to several factors that are causing a continuous challenge for the country. Kuwait’s electricity consumption is significantly affected by a variety of factors including an increasing population, economic growth, the abundant availability of fossil fuels, the rising prices of oil, extremely harsh weather conditions, the government’s subsidization of electricity (Tariff), and the low efficiency of power plants [39,40,41]. These elements play a crucial role in the variations in consumption that occur on an annual basis. It has been observed that, over the last 30 years (1992–2021), electricity consumption in Kuwait has increased by approximately 407%. Additionally, it was noted that, even though there was a slight decrease in population in the year 2000 by about 2%, the electricity consumption paradoxically increased by 2%. In 2020, a minor decrease in electricity consumption by 0.42% was recorded, which can likely be attributed to the reduced economic and social activities during the COVID-19 pandemic [42]. However, in 2021, despite another substantial drop in population by 6.1%, the electricity consumption registered a significant increase by 8.6% [42]. These fluctuations highlight the complexity of the factors influencing electricity demand and highlight the necessity of considering a broader range of variables beyond just population and societal growth. Understanding these dynamics is essential for developing effective energy policies and planning for the future energy needs in Kuwait.
During the examination of the collected data, outliers had to be eliminated as part of the data cleaning procedure before further analysis. Using Tukey’s method, it was determined that there were no outliers in the electricity consumption records, as shown in Figure 2. However, some outliers were detected in certain meteorological variables such as wind speed, maximum wind speed, minimum relative humidity, and dewpoint temperatures. Pearson correlation analysis revealed a positive correlation between electricity consumption and factors such as air temperatures and dewpoint depression, and a negative correlation with relative humidity. Other parameters either showed no significant correlation or had a minimal influence on electricity usage. These findings are consistent with those reported by Kazeem et al. [43], who observed that the average air temperature has the most significant impact on increasing electricity consumption. However, this observation contrasts with the suggestion of Fahad and Arbab [44] that, on windy summer days, the use of electricity tends to decrease due to the cooler air brought by the wind. Furthermore, Yang et al. [45] assert that temperature is the primary factor affecting electricity use, which aligns with this research’s finding that higher temperatures generally lead to increased electricity use. Similarly, Paravan et al. [46] and Chen et al. [47] show that warmer weather in summer leads to increased electricity use, while colder conditions in winter necessitate more heating and therefore more electricity consumption.
Applying HCA to average air temperature divides the year into two main groups based on temperature variations. The first group spans from October to May, characterized by relatively low temperatures, and is further divided into two subgroups: November to February, representing the core winter months, and March to May along with October, marking transitional periods. The second group, from June to September, experiences extremely high temperatures. These temperature fluctuations correlate significantly with electricity consumption, with higher temperatures leading to increased consumption, particularly during the summer months. Additionally, relative humidity shows a similar clustering pattern, subdivided into seasons, with a negative correlation with temperature: higher temperatures coincide with lower humidity and increased electricity consumption, while cooler months show the opposite trend. Dewpoint depression also exhibits a comparable clustering, with seasonal changes affecting electricity consumption, highlighting the interplay between meteorological factors and electricity usage. Consistent with this study, which simplified Kuwait’s seasons into two categories, Alkhalidi et al. [48] identified three clusters but concluded with two due to overlaps. Similarly, the HCA results of this research align well with the findings of Alsulaili et al. [49], further supporting this study’s approach.
Using MLR, in the initial trial, all 16 variables were examined, but only 10 were included in the final model: average air temperature, maximum wind speed, wind direction, maximum wind direction, maximum dewpoint temperature, minimum dewpoint temperature, average relative humidity, maximum relative humidity, minimum relative humidity, and wind speed. This model showed an $R^2$ value of 0.932. However, high variance inflation factor (VIF) values indicated multicollinearity, making the model unacceptable. In the second attempt, only the seven most correlated variables were used, resulting in a model with an $R^2$ value of 0.96. However, multicollinearity issues again led to its rejection. For the third attempt, only the most influential variable, average air temperature, was used. Table 1 provides a summary of all MLR results. The last MLR model met the acceptance criteria, with an $R^2$ value of 0.914 and a VIF value of 1, indicating no multicollinearity issues, as shown in Figure 3. These $R^2$ values align closely with those reported in predictive models from other researchers, as seen in the study by Almuhaini and Sultana [50]. This correspondence suggests that the modeling approaches employed in this analysis are robust and align well with other methods in the field. Furthermore, it supports the methodologies used for forecasting electricity consumption in arid regions, offering a reliable benchmark for further investigations.
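The multicollinearity check can be illustrated with the variance inflation factor, $\mathrm{VIF}_j = 1 / (1 - R_j^2)$, where $R_j^2$ comes from regressing predictor $j$ on the other predictors. With exactly two predictors, $R_j^2$ equals the squared Pearson correlation between them, which the sketch below exploits. The function names and data are hypothetical, and the common rule of thumb that a VIF above roughly 5 to 10 signals a problem is a convention, not a threshold taken from this study.

```python
import statistics

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a two-predictor regression."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)

# Made-up values: average and maximum air temperature move together.
temp_avg = [14.0, 20.0, 27.0, 33.0, 38.0, 35.0, 21.0]
temp_max = [20.5, 27.0, 34.5, 40.0, 45.5, 42.0, 27.5]
vif_collinear = vif_two_predictors(temp_avg, temp_max)

# Two exactly uncorrelated predictors give a VIF of 1.
vif_independent = vif_two_predictors([1.0, 2.0, 3.0, 4.0],
                                     [1.0, -1.0, -1.0, 1.0])
```

A model with a single predictor has no other predictors to be collinear with, so its VIF is 1 by construction, consistent with the value reported for the third attempt.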
Additionally, building on the insights gained from the Pearson correlation analysis, an XGBoost model was developed to enhance the forecasting of electricity consumption. This model incorporated several critical meteorological inputs, namely average air temperature, average relative humidity, and dewpoint depression, which were identified as the parameters most correlated with electricity demand. The performance of the XGBoost model was evaluated against the traditional MLR model, and it demonstrated substantial improvements in several key statistical metrics. Specifically, it outperformed the MLR model in RMSE, MAPE, and $R^2$, as highlighted in Figure 4. This figure shows that the predicted values align better with the actual records and have less noise compared to the MLR results.
While these improvements in error metrics are promising and represent a significant advancement in predictive modeling, they do not necessarily transform the regression XGBoost model into a definitive tool for accurate and reliable forecasting. The enhancements primarily indicate that meteorological factors have a significant impact on electricity consumption patterns. However, the reliance on these parameters introduces specific challenges, especially in regions like Kuwait, where obtaining consistent and accurate meteorological data can be particularly challenging due to the high number of missing or uncertain data [51]. The variability in such data can significantly affect the reliability of forecasts.
Moreover, the XGBoost and MLR models did not reveal any significant upward trends in electricity consumption, which limits their utility for long-term energy planning and forecasting future increases in demand. This is a critical limitation, as the accurate predictions of long-term trends are essential for effective energy management and infrastructure development [52,53].
To address these issues and enhance the robustness of our forecasting approach, the study then shifted focus towards time series analysis. Time series methods, unlike the previously used regression models, can integrate historical consumption data more effectively and are less dependent on external inputs such as meteorological factors. This shift aimed to develop more precise and reliable electricity load forecasts by capturing underlying patterns and trends in energy usage over time. The exploration of time series analysis is expected to offer a better framework for predicting future electricity consumption trends, taking into account both seasonal variations and long-term shifts in demand patterns.
The MTS forecasting model was implemented using monthly electricity consumption data from 2017 to 2021 to forecast demand for the period from 2022 to 2026. This model explicitly treated the underlying seasonal pattern as a 12-month cycle, effectively capturing the seasonal nature of electricity consumption. The seasonal ($S_t$) and irregularity ($I_t$) components of the model were calculated by averaging the values for each month, providing a clear, detailed breakdown of consumption patterns. Values exceeding the baseline of 1.0 were consistently observed during the high-demand months of May, June, July, August, September, and October, reflecting increased electricity usage during these warmer months.
The two initial attempts to refine the MTS model faced challenges; however, the breakthrough came on the third attempt, which successfully captured a noticeable upward trend in the predicted values. This trend indicates an annual increase in electricity consumption, corroborating the trends observed in real-world energy usage data [54]. The success of this third iteration was quantified by significantly low MAPE and RMSE values, reinforcing the model’s excellent performance and its robustness in forecasting electricity demand. Moreover, the model’s ability to effectively forecast future consumption patterns was visually represented in Figure 5, which detailed the predictive outcomes and validated the model’s accuracy and relevance in the context of actual consumption trends.
These findings demonstrate that the MTS model yielded adequate forecasts of future electricity consumption. However, the effectiveness of the MTS model necessitated the transformation of daily data into monthly aggregates to better capture and maintain the inherent trend and seasonality patterns essential for reliable forecasting. This aggregation, while useful for identifying broader trends, risked losing critical daily details, which are crucial for developing day-to-day predictive models.
In response to these limitations and to better leverage the detailed information contained in daily records, an XGBoost time series model was subsequently developed. This innovative model adeptly utilized the noisy daily records, incorporating a 3-day lag as input to preserve and thoroughly analyze the finer temporal dynamics. Remarkably, this XGBoost model not only succeeded in maintaining the seasonal patterns observed with the aggregated monthly data, but it also produced highly accurate results when evaluated against the test set, as illustrated in Figure 6.
Moreover, it scored lower than all previous models in terms of error metrics and relied solely on historical electricity consumption data. The model also projected electricity consumption up to the year 2026, providing daily predicted records. These predictions offer detailed insights into expected usage patterns and potential peak demand periods throughout the forecast period, making the data exceptionally valuable for operational planning and demand management. This level of detail is crucial for identifying specific days that may require additional energy supply or demand response strategies. Given its comprehensive coverage and low dependency on extraneous variables, this suggests that the XGBoost model might be the best option for real-world applications, as it effectively predicts future consumption using only historical consumption data without the need for additional parameters.
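The lagged-input idea can be sketched as follows: each day’s consumption is predicted from the previous three days, and multi-day projections are produced by feeding each prediction back in as an input. The recursive strategy is an assumption about how the forecast horizon was reached, the function names are hypothetical, and `mean_of_lags` is only a stand-in for a trained regressor such as XGBoost.

```python
def make_lagged(series, n_lags=3):
    """Build (features, targets): the previous n_lags values predict the next."""
    features, targets = [], []
    for t in range(n_lags, len(series)):
        features.append(list(series[t - n_lags:t]))
        targets.append(series[t])
    return features, targets

def recursive_forecast(predict_one, history, steps, n_lags=3):
    """Roll forward, feeding each prediction back in as the newest lag."""
    window = list(history[-n_lags:])
    forecasts = []
    for _ in range(steps):
        nxt = predict_one(window)
        forecasts.append(nxt)
        window = window[1:] + [nxt]
    return forecasts

def mean_of_lags(window):
    """Placeholder model (assumption): the average of the lagged values."""
    return sum(window) / len(window)

# Made-up daily consumption values, for illustration only.
daily_load = [100.0, 102.0, 101.0, 105.0, 107.0, 106.0, 110.0]

X, y = make_lagged(daily_load)          # rows of 3 lags and their targets
future = recursive_forecast(mean_of_lags, daily_load, steps=5)
```

Because later forecasts depend on earlier ones, errors can accumulate over long horizons, which is worth keeping in mind when reading multi-year daily projections.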
When the actual and forecasted records are aggregated into monthly data, it becomes apparent that the rising trend is not as steep as the one predicted by the MTS model. This discrepancy may be attributed to the XGBoost model benefiting from training on a more extensive dataset compared to the MTS model, which relied only on monthly data to identify critical parameters. The conversion into monthly data might result in the loss of crucial details that affect trend accuracy. The final model’s MAPE value on the test set is 0.014%, significantly lower than the MAPE value reported in the study by Rathore et al. [55]. Both values are remarkably small, suggesting high accuracy. The probable reason for the smaller MAPE value in the final model is that the XGBoost model was trained on a broader range of data, unlike the limited two-month dataset used in the study by Rathore et al. [55]. Moreover, the XGBoost time-series model appears less conservative, leading to lower long-term consumption estimates, as shown in Figure 7. This can be seen by comparing its trend steepness with that of the MTS forecast.
Table 2 provides a comparative overview of all models, highlighting their key performance metrics and settings. This comparison allows for an easy evaluation of each model’s effectiveness in predicting electricity consumption.
This study highlights a limitation in relying on trial-and-error methods for optimizing machine learning models, which can result in missing the optimal model structure. A more systematic approach, such as Grid Search, could significantly improve hyperparameter tuning and model accuracy [56]. Comparisons between machine learning models and traditional statistical models, which serve as baselines, show only minimal improvements in error metrics. Access to a larger dataset could lead to more significant improvements. Direct comparisons between machine learning models might be more beneficial than contrasting them with statistical models.
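As a rough illustration of what a systematic search involves, the sketch below enumerates every combination in a small hyperparameter grid and keeps the one with the lowest validation error. The grid values and the scoring function are placeholders chosen for this example. In a real workflow, the scoring function would train the model with the given settings and return an error metric such as RMSE on held-out data.

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Evaluate every combination in `param_grid`; return the lowest-scoring one."""
    names = list(param_grid)
    best_params, best_score = None, float("inf")
    for values in product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical grid: the names mirror common gradient-boosting settings,
# but the values are not those used in the study.
param_grid = {
    "max_depth": [3, 5, 7],
    "learning_rate": [0.01, 0.1, 0.3],
    "n_estimators": [100, 500],
}


# Stand-in for "train the model and return the validation error".
def toy_validation_error(params):
    return (
        (params["max_depth"] - 5) ** 2
        + 100.0 * (params["learning_rate"] - 0.1) ** 2
        + abs(params["n_estimators"] - 500) / 1000.0
    )


best_params, best_score = grid_search(param_grid, toy_validation_error)
```

The cost grows multiplicatively with the number of values per setting (here 3 × 3 × 2 = 18 evaluations), which is the main trade-off against trial-and-error tuning.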
Ultimately, forecasts for the years 2022 to 2026 were compared among the MTS model, the XGBoost time series model, and the expected electricity production from MEW [42]. Both the MTS and XGBoost models support the notion that Kuwait’s electricity supply is secure, as their forecasts are consistently lower than the expectations set by MEW. This suggests that there is a buffer in the energy supply, potentially alleviating concerns over future energy shortages. Figure 8 illustrates a comparison between the two time series models from this study and the planned electrical energy production according to MEW’s projections. These findings highlight the robustness of Kuwait’s planning strategies and the reliability of advanced predictive models in ensuring energy security. However, it remains crucial for policy makers and energy authorities to continuously update their models and assumptions to adapt to fast-changing global energy landscapes and technological advancements. Furthermore, this comparison can serve as a foundational analysis for enhancing the accuracy of future forecasts and refining energy policy decisions. The findings also prompt a deeper investigation into the reasons behind the discrepancies between the models’ forecasts and MEW’s projections, which could provide insights into potential areas of improvement in energy consumption forecasting and strategic planning.

4. Conclusions

This study highlights the complex relationship between meteorological factors and electricity consumption in arid regions, with the models tested showing promising capabilities for predicting electricity consumption. Accurate predictive models are crucial for strategic energy planning and sustainability initiatives, as they enable better demand forecasting, efficient resource allocation, and reduced operational costs. Additionally, these models can aid in optimizing the integration of renewable energy sources, enhancing grid stability, and supporting policy-making aimed at mitigating the impact of climate change on energy systems.
Although relying on trial-and-error methods for optimizing machine learning models presents a limitation, adopting systematic approaches could significantly improve model accuracy. The forecasted electricity consumption values can serve as a foundation for the effective management and efficient planning of resources, ensuring reliable supplies and reduced costs and emissions. This research is expected to inspire further investigations into how meteorological factors affect electricity consumption in arid regions, particularly in areas where advanced methods are not utilized. Future studies could explore additional factors that impact consumption trends, such as solar radiation, or refine the models used in this research by incorporating ensemble learning techniques, convolutional neural networks, and long short-term memory (LSTM) networks. Researchers could also examine the seasonal variability of meteorological influences and their interaction with socio-economic factors, or apply these models to different geographical contexts with similar climatic conditions, such as semi-arid regions. These improvements could lead to more advanced forecasting tools, specifically designed for the unique environmental and climatic characteristics of arid areas, enhancing their effectiveness and dependability.

Author Contributions

Conceptualization: A.A.; Investigation: N.A. (Nasser Alenezi); Methodology: A.A., N.A. (Noor AboRamyah), N.A. (Nasser Alenezi), and M.A.; Project administration: A.A.; Software: N.A. (Noor AboRamyah) and N.A. (Nasser Alenezi); Supervision: A.A. and M.A.; Validation: M.A.; Visualization: N.A. (Nasser Alenezi); Writing—original draft: A.A. and N.A. (Noor AboRamyah); Writing—review and editing: N.A. (Nasser Alenezi) and M.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are available upon request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. IEA. Global Energy & CO2 Status Report 2019; Technical Report; International Energy Agency: Paris, France, 2019.
  2. IEA. Electricity – Energy System; Technical Report; International Energy Agency: Paris, France, 2022.
  3. Jabir, H.J.; Teh, J.; Ishak, D.; Abunima, H. Impacts of Demand-Side Management on Electrical Power Systems: A Review. Energies 2018, 11, 1050.
  4. Karimulla, S.; Ravi, K. A Review on Importance of Smart Grid in Electrical Power System. In Proceedings of the 2019 International Conference on Computation of Power, Energy, Information and Communication (ICCPEIC), Melmaruvathur, India, 27–28 March 2019; pp. 22–27.
  5. Hasanuzzaman, M.; Rahman, S.; Masjuki, H. Effects of different variables on moisture transfer of household refrigerator-freezer. Energy Educ. Sci. Technol. Part Energy Sci. Res. 2011, 27, 401–418.
  6. Saidur, R.; Hasanuzzaman, M.; Sattar, M.; Masjuki, H.; Anjum, M.; Mohiuddin, A. An analysis of energy use, energy intensity and emissions at the industrial sector of Malaysia. Int. J. Mech. Mater. Eng. 2007, 2, 84–92.
  7. Hasanuzzaman, M.; Abd Rahim, N.; M, H.; Rahman, S.; Islam, M.M.; Rashid, M. Energy savings in the combustion based process heating in industrial sector. Renew. Sustain. Energy Rev. 2012, 16, 4527–4536.
  8. Enerdata. World Power Consumption – Electricity Consumption; Enerdata: Grenoble, France, 2022.
  9. Nti, I.; Teimeh, M.; Adekoya, A.; Nyarko-Boateng, O. Forecasting Electricity Consumption of Residential Users Based on Lifestyle Data Using Artificial Neural Networks. Ictact J. Soft Comput. 2020, 10, 2107–2116.
  10. Proedrou, E. A Comprehensive Review of Residential Electricity Load Profile Models. IEEE Access 2021, 9, 12114–12133.
  11. Krarti, M. Evaluation of Energy Efficiency Potential for the Building Sector in the Arab Region. Energies 2019, 12, 4279.
  12. Isaac, M.; Vuuren, D. Modeling global residential sector energy demand for heating and air conditioning in the context of climate change. Energy Policy 2009, 37, 507–521.
  13. Demirel, O.; Kakilli, A.; Tektas, M. Electric energy load forecasting using anfis and arma methods. J. Fac. Eng. Archit. Gazi Univ. 2010, 25, 601–610.
  14. Indexmundi. Electricity Consumption per Capita. 2020. Available online: https://www.indexmundi.com/map/?v=81000 (accessed on 3 June 2024).
  15. Sim, L.C. Renewable Energy and Governance Resilience in the Gulf. Energies 2023, 16, 3225.
  16. Alawadhi, A.; Burney, N.; Gelan, A.; Al-Fulaij, S.; Al-Musallam, N.; Awadh, W. The effect of conservation on residential electricity consumption: Evidence from Kuwait. Int. Rev. Appl. Econ. 2022, 36, 589–607.
  17. Bunn, D.W. Short-Term Forecasting: A Review of Procedures in the Electricity Supply Industry. J. Oper. Res. Soc. 1982, 33, 533–545.
  18. Wang, R.; Yao, X.; Li, C.; Hu, B.; Xie, K.; Niu, T.; Li, M.; Fu, J.; Sun, Q. Combination Forecasting Model of Daily Electricity Consumption in Summer Based on Daily Characteristic Meteorological Factors. IOP Conf. Ser. Mater. Sci. Eng. 2020, 853, 012024.
  19. Rajbhandari, Y.; Marahatta, A.; Ghimire, B.; Shrestha, A.; Gachhadar, A.; Thapa, A.; Chapagain, K.; Korba, P. Impact Study of Temperature on the Time Series Electricity Demand of Urban Nepal for Short-Term Load Forecasting. Appl. Syst. Innov. 2021, 4, 43.
  20. Asumadu Sarkodie, S.; Ahmed, M.; Owusu, P.A. Ambient air pollution and meteorological factors escalate electricity consumption. Sci. Total. Environ. 2021, 795, 148841.
  21. Goeb, R.; Lurz, K.; Pievatolo, A. Electrical Load Forecasting by Exponential Smoothing with Covariates. Appl. Stoch. Model. Bus. Ind. 2013, 29, 629–645.
  22. Holtedahl, P.; Joutz, F. Residential electricity demand in Taiwan. Energy Econ. 2004, 26, 201–224.
  23. Hinman, J.; Hickey, E. Modeling and Forecasting Short-Term Electricity Load Using Regression Analysis; Illinois State University: Normal, IL, USA, 2009.
  24. Dordonnat, V.; Koopman, S.J.; Ooms, M.; Dessertaine, A.; Collet, J. An Hourly Periodic State Space Model for Modelling French National Electricity Load. Int. J. Forecast. 2007, 24, 566–587.
  25. Reddy, G.; Aitha, L.; Poojitha, C.; Shreya, A.; Reddy, D.; Meghana, G. Electricity Consumption Prediction Using Machine Learning. E3S Web Conf. 2023, 391, 01048.
  26. Hernández, G.; González-Briones, A.; Chamoso, P.; Casado-Vara, R.; Prieto, J.; Venyagamoorthy, K.; Corchado, J. Review of the state of the art of machine models for household consumption prediction. In Proceedings of the DREAM-GO, Porto, Portugal, 16–17 January 2019; pp. 31–36.
  27. DGCA. Kuwait Meteorological Center – Climate History; Kuwait Meteorological Department: Kuwait City, Kuwait, 2022.
  28. Aldaithan, Z.; Almethen, O. The Variation of Rain Fall in Kuwiat from 1962 Till 2010. Int. J. Eng. Sci. 2017, 6, 32–39.
  29. Seo, S. A Review and Comparison of Methods for Detecting Outliers in Univariate Data Sets. Available online: https://www.semanticscholar.org/paper/A-Review-and-Comparison-of-Methods-for-Detecting-in-Seo/cb868f0b242b9623b7544a58b6a21647dfa138a5 (accessed on 12 January 2024).
  30. Salgado, C.M.; Azevedo, C.; Proença, H.; Vieira, S.M. Noise Versus Outliers. In Secondary Analysis of Electronic Health Records; Springer International Publishing: Cham, Switzerland, 2016; pp. 163–183.
  31. Ly, A.; Marsman, M.; Wagenmakers, E.J. Analytic Posteriors for Pearson’s Correlation Coefficient. Stat. Neerl. 2015, 72, 4–13.
  32. Tullis, T.; Albert, B. Measuring the User Experience Collecting, Analyzing, and Presenting Usability Metrics; Elsevier: Amsterdam, The Netherlands, 2013.
  33. Gere, A. Recommendations for validating hierarchical clustering in consumer sensory projects. Curr. Res. Food Sci. 2023, 6, 100522.
  34. Musarat, M.A.; Alaloul, W.; Rabbani, M.; Ali, M.; Altaf, M.; Fediuk, R.; Vatin, N.; Klyuev, S.; Bukhari, H.; Sadiq, A.; et al. Kabul River Flow Prediction Using Automated ARIMA Forecasting: A Machine Learning Approach. Sustainability 2021, 13, 10720.
  35. ABS. Time Series Analysis: The Basics; Australian Bureau of Statistics: Canberra, Australia, 2005.
  36. Natekin, A.; Knoll, A. Gradient Boosting Machines, A Tutorial. Front. Neurorobotics 2013, 7, 21.
  37. Ali, Z.; Abduljabbar, Z.; Tahir, H.; Sallow, A.; Almufti, S. Exploring the Power of eXtreme Gradient Boosting Algorithm in Machine Learning: A Review. Acad. J. Nawroz Univ. (AJNU) 2023, 12, 320–334.
  38. Zhang, P.; Jia, Y.; Shang, Y. Research and application of XGBoost in imbalanced data. Int. J. Distrib. Sens. Netw. 2022, 18, 15501329221106935.
  39. Ali, H.; Alsabbagh, M. Residential Electricity Consumption in the State of Kuwait. Environ. Pollut. Clim. Change 2018, 2, 153.
  40. Alrashidi, M.; El-Naggar, K. Long term electric load forecasting based on particle swarm optimization. Appl. Energy 2010, 87, 320–326.
  41. Alotaibi, S. Energy consumption in Kuwait: Prospects and future approaches. Energy Policy 2011, 39, 637–643.
  42. MEW. Statistical Yearly Book – Electrical Energy; Technical Report; Ministry of Electricity & Water & Renewable Energy: Kuwait City, Kuwait, 2021.
  43. Kazeem, R.; Amakor, J.; Ikumapayi, O.; Afolalu, A.; Oke, W. Modelling the Effect of Temperature on Power Generation at a Nigerian Agricultural Institute. Math. Model. Eng. Probl. 2022, 9, 645–654.
  44. Fahad, M.; Arbab, N. Factor Affecting Short Term Load Forecasting. J. Clean Energy Technol. 2014, 2, 305–309.
  45. Yang, T.; Ren, M.; Zhou, K. Identifying household electricity consumption patterns: A case study of Kunshan, China. Renew. Sustain. Energy Rev. 2018, 91, 861–868.
  46. Paravan, D.; Debs, A.; Hansen, C.; Becker, D.; Hirsch, P.; Golob, R. Influence of temperature on short-term load forecasting using the EPRI-ANNSTLF. In Proceedings of the Second Balkan Electricity Conference, Varna, Bulgaria, 28 June 2002; pp. 19–24.
  47. Chen, B.J.; Chang, M.W.; Lin, C.J. Load forecasting using support vector machines: A study on EUNITE Competition 2001. Power Syst. IEEE Trans. 2004, 19, 1821–1830.
  48. Alkhalidi, M.; Alsulaili, A.; Almarshed, B.; Bouresly, M.; Alshawish, S. Assessment of Seasonal and Spatial Variations of Coastal Water Quality Using Multivariate Statistical Techniques. J. Mar. Sci. Eng. 2021, 9, 1292.
  49. Alsulaili, A.; Alkandari, M.; Buqammaz, A. Assessing the impacts of meteorological factors on freshwater consumption in arid regions and forecasting the freshwater demand. Environ. Technol. Innov. 2022, 25, 102099.
  50. Almuhaini, S.H.; Sultana, N. Forecasting Long-Term Electricity Consumption in Saudi Arabia Based on Statistical and Machine Learning Algorithms to Enhance Electric Power Supply Management. Energies 2023, 16, 2035.
  51. Al-Nassar, W.; Alhajraf, S.; Al-Enizi, A.; Al-Awadhi, J. Potential wind power generation in the State of Kuwait. Renew. Energy 2005, 30, 2149–2161.
  52. Tang, L.; Wang, X.; Wang, X.; Shao, C.; Liu, S.; Tian, S. Long-term Electricity Consumption Forecasting Based on Expert Prediction and Fuzzy Bayesian Theory. Energy 2018, 167, 1144–1154.
  53. Ghods, L.; Kalantar, M. Different Methods of Long-Term Electric Load Demand Forecasting a Comprehensive Review. Iran. J. Electr. Electron. Eng. 2011, 7, 249–259.
  54. Galeshi, S. ‘Cluster’ Converters Based on Multi-Port Active-Bridge: Application to Smartgrids. Ph.D. Thesis, Université Grenoble Alpes, Grenoble, France, 2021.
  55. Rathore, H.; Meena, H.K.; Jain, P. Prediction of EV Energy consumption Using Random Forest and XGBoost. In Proceedings of the 2023 International Conference on Power Electronics and Energy (ICPEE), Bhubaneswar, India, 3–5 January 2023; pp. 1–6.
  56. Alenezi, N.; Alsulaili, A.; Alkhalidi, M. Prediction of Sea Level in the Arabian Gulf Using Artificial Neural Networks. J. Mar. Sci. Eng. 2023, 11, 2052.
Figure 1. State of Kuwait map.
Figure 2. Electricity consumption box plot.
Figure 3. Results of MLR model.
Figure 4. Results of XGBoost using meteorological factors.
Figure 5. MTS forecasting beyond year 2021.
Figure 6. XGBoost time-series model results.
Figure 7. XGBoost time-series forecasting beyond year 2021 (monthly records).
Figure 8. Comparison between MTS, XGBoost, and MEW.
Table 1. Summary of multiple linear regression models.

| Input | Independent Variables | Unstandardized Coefficient | p-Value | R² | VIF |
|---|---|---|---|---|---|
| Trial 1 | Average air temperature | 6744.857 | 0.000 | 0.932 | 3.830 |
| | Wind max speed | −1368.116 | 0.000 | | 1.559 |
| | Wind direction | 24.203 | 0.000 | | 1.785 |
| | Wind max direction | −14.486 | 0.000 | | 8.222 |
| | Maximum dewpoint temperature | 630.995 | 0.001 | | 5.329 |
| | Minimum dewpoint temperature | −1136.953 | 0.000 | | 35.90 |
| | Average relative humidity | 772.630 | 0.000 | | 7 |
| | Maximum relative humidity | −315.115 | 0.000 | | 22.94 |
| | Minimum relative humidity | −239.655 | 0.009 | | 7 |
| | Wind speed | −892.025 | 0.045 | | 10.67 |
| Trial 2 | Average air temperature | 6152.699 | 0.000 | 0.961 | 37.933 |
| | Maximum air temperature | 1102.504 | 0.000 | | 50.633 |
| | Minimum air temperature | −935.383 | 0.001 | | 35.241 |
| | Average relative humidity | 98.431 | 0.003 | | 2.637 |
| Trial 3 | Average air temperature | 6319.903 | 0.000 | 0.914 | 1 |

VIF: variance inflation factor.
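For readers interpreting the VIF column, the standard definition of the variance inflation factor for predictor j is given below, where R_j² denotes the coefficient of determination obtained by regressing predictor j on the remaining predictors. This is the textbook formula, included here as a reading aid rather than as a description of the authors' computation.

```latex
\mathrm{VIF}_j = \frac{1}{1 - R_j^2}
\qquad\Longleftrightarrow\qquad
R_j^2 = 1 - \frac{1}{\mathrm{VIF}_j}
```

As a worked example, the VIF of 37.933 reported for average air temperature in Trial 2 corresponds to R_j² = 1 − 1/37.933 ≈ 0.974, meaning that about 97% of its variance is explained by the other inputs in that trial. With a single predictor, as in Trial 3, there is nothing to regress against, so R_j² = 0 and VIF = 1. Thresholds of roughly 5 or 10 are commonly used as rules of thumb for problematic multicollinearity.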
Table 2. Comparative overview of all models.

| No | Model | Type | Interval | Inputs | RMSE | MAPE | R² |
|---|---|---|---|---|---|---|---|
| 1 | MLR | Regression | Daily | Air temperature | 19,217.33 | 8.445% | 0.914 |
| 2 | XGBoost | Regression | Daily | Air temperature; relative humidity; dewpoint depression | 17,776.55 | 7.922% | 0.927 |
| 3 | MTS | Time series | Monthly | Average electricity consumption at certain time; seasonal parameter; irregularity parameter | 7269.67 | 2.613% | - |
| 4 | XGBoost | Time series | Daily | 3 days lag | 8167.975 | 3.09% | - |
| 5 | XGBoost | Time series | Monthly | Averaged from daily XGBoost time-series model (No. 4); note: not a model | 25.87 | 0.014% | - |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Alsulaili, A.; Aboramyah, N.; Alenezi, N.; Alkhalidi, M. Advancing Electricity Consumption Forecasts in Arid Climates through Machine Learning and Statistical Approaches. Sustainability 2024, 16, 6326. https://doi.org/10.3390/su16156326
