1. Introduction
A smart grid is the future vision of power systems that will be enabled by artificial intelligence (AI), big data, and IoT, where digitalization is at the core of the energy sector transformation. The smart grid concept was introduced in the 2000s to address multiple issues, such as power quality, energy security, renewable integration, etc., through new investment in modern bidirectional communication infrastructure [
1]. In 2011, the Electric Power Research Institute (EPRI) referred to the smart grid as “a modernization of the electricity delivery system that can monitor, protect, and automatically optimize the operation of its interconnected elements”.
AI is another technology starting to positively impact the energy sector as enterprises change their attitude towards this technology. A recent survey conducted by Siemens found that energy companies are already transforming their operations using AI, where 30% responded that they are using AI for more intelligent automation of machinery equipment, while 28% are using it for asset maintenance forecasts. However, the study also found that many leaders are cautious about implementing AI, as they still have difficulty trusting it with important decisions [
2].
Energy managers are becoming more concerned about the reliability and security of power systems. In addition, the digital economy has imposed greater demand on the electricity supply’s reliability, with more consumers and electric vehicles (EVs) becoming connected to the electric grid. Therefore, the electricity industry is now on the verge of a new era faced with many challenges to meet higher security, interoperability, and reliability requirements. The growing challenges are rising electricity demand, peak demand growth, energy security [
3], lack of smart grid technology standards, EV accommodation [
4], and cybersecurity concerns.
Energy managers are struggling to find ways to reduce the gap between supply and demand. Therefore, energy planners use various methods and technologies to support the sustainable expansion of power systems, such as electricity demand forecasting models, stochastic optimization, robust optimization, and simulation. Electricity forecasting plays a vital role in supporting the reliable transitioning of power systems. For example, the authors in [
5] recently provided a global perspective on the importance of electricity forecasting and the state-of-the-art techniques to support rising electricity demand in low and middle-income countries, focusing mainly on Pakistan. However, challenges still exist in generating more accurate forecasts due to the granularity and quality of the data collected from sensors and Supervisory Control and Data Acquisition (SCADA) systems, the nonlinear and noisy patterns presented in the data, and the complex features that affect it.
Short-term load forecasting (STLF) has become an active area of research over the last few years, with a handful of studies. With the advent of the smart grid, there is a need for more accurate forecasting models to allow for better planning and operation of electricity providers to meet consumer demand reliably. STLF deals with predicting demand one hour to 24 h in advance. It can help support short-term decisions such as economic dispatch of power plants, fuel purchases, and electricity market trading while addressing the grid’s real-time control and security from massive power outages.
The ability to accurately forecast demand one hour to a day ahead can help energy suppliers anticipate how much power to generate to meet real-time consumer demand in the most reliable and cost-effective way possible. Underestimating demand can lead to power outages and unreliable grid operation, while overestimating demand can result in energy wastage. Therefore, an accurate forecast can result in better energy management and significant cost savings for suppliers. However, short-term load forecasting can be challenging because the load exhibits highly nonlinear patterns that are difficult to model. In addition, several factors can affect the load, such as weather, season, time of the day, consumer behavior, and other random factors.
Three main methods have been proposed in the literature to solve this problem, which are (1) traditional statistical-based models, (2) machine learning-based models, and (3) deep learning-based models. The most common statistical methods used in the literature are autoregressive moving average (ARMA), autoregressive integrated moving average (ARIMA), exponential smoothing, linear regression, and the similar day approach. However, these methods are limited in learning the complex nonlinear interactions between the input and output variables. Therefore, they do not provide satisfactory results for such problems. On the other hand, machine learning methods can deal with the shortcomings of statistical-based models since they can model complex nonlinear mapping between inputs and outputs, learn hidden patterns in vast amounts of data, and offer scalability. Examples of machine learning methods widely used for STLF are Artificial Neural Networks (ANNs), decision tree regression, ensemble trees, random forest, support vector regression (SVR), and extreme learning machines (ELM).
Machine learning methods have been widely studied for STLF. The authors in [
6] compared four machine learning approaches for estimating electricity demand in Cyprus between 2016 and 2017 with short to long-term analysis. Population, economic, and weather variables were introduced into the model to forecast electricity demand for the region. The study concluded that ANN and support vector machine (SVM) methods were superior to multiple linear regression [
6]. Ref. [
7] proposed an improved machine learning framework based on SVM and ELM. For this study, the hyperparameters were tuned with the Grid Search method. The authors agreed on the fast training and accuracy that ELM provided.
Recurrent Neural Networks (RNNs) have become more established in the STLF field, with Long Short-Term Memory (LSTM) receiving increased attention. The authors in [
8] proposed a new architecture based on RNN to forecast electricity demand for different time scales. The model was benchmarked against other established neural networks, i.e., backpropagation and LSTM. The results indicated that RNN had superior performance and was easier to train than LSTM. The authors in [
9] first proposed an improved hybrid model based on LSTM and ELM to learn deep and shallow electricity patterns. The hybrid model performed the best after being benchmarked against classical ELM, LSTM, and SVR. The authors in [
10] compared single versus deep-stacked LSTM neural networks with different activation functions to forecast electricity load one hour ahead, considering historical temperature and load data. The results demonstrated that the model with two stacked LSTM layers performed the best, with a MAPE of 1.53%. The researchers in [
11] investigated two deep learning methods, LSTM and the Gated Recurrent Unit (GRU), benchmarked against ANN and ensemble trees. Deep learning provided the most stable and accurate performance. The work in [
12] proposed an LSTM-based framework to predict short-term residential demand using open data from the Australia Smart Grid project.
LSTM was the most successful for individual and aggregated forecasts after being benchmarked against machine learning methods. A sequence-to-sequence (Seq2seq) architecture was investigated by [
13]. The model was benchmarked against RNN, LSTM, and GRU methods. The Seq2seq model had superior performance, with a MAPE of 5.20%. Bi-LSTM has also been widely investigated in the literature. The authors in [
14] proposed a novel Bi-LSTM network with an attention mechanism to predict load up to half an hour in advance. The proposed model performed better than the traditional Bi-LSTM, given that more weight is allocated to important information. The researchers in [
15] proposed a stacked Bi-LSTM method to forecast day and week residential consumption in Scotland, using historical demand and weather features. Bi-LSTM delivered high accuracy, with MAPE ranging between 1.66% and 2.22% for the maximum demand week. The work in [
16] found that deep networks based on Bi-LSTM did not improve performance. The authors in [
17] later compared the performance of LSTM with two machine learning methods to solve single and multi-step forecasting. LSTM had superior performance, especially during the summer. In [
18], the effectiveness of LSTM in delivering an accurate forecast of one hour and 24 h ahead was similarly demonstrated in Poland. The authors agreed that LSTM could support forecasting, especially for small power regions with irregular demand patterns.
Research has recently focused on Convolutional Neural Networks (CNN) for STLF. Ref. [
9] demonstrated that temporal CNN could effectively provide a reliable forecasting model compared to SVR and Long Short-Term Memory (LSTM) networks. The authors in [
19] proposed a novel Wavenet model that combines causal Convolutional Neural Networks (CNNs) and LSTM inspired by fine-tuning to support demand response programs. Although the model performed better than the benchmarked methods, the authors suggested considering weather and holiday indicators for future work. Researchers in [
20] developed a hybrid CNN-LSTM model with clustering analysis to predict Australia’s electricity consumption. A remark was made on the robustness of the model to outliers. Ref. [
21] recommended an integrated CNN-LSTM model to forecast the electricity load for Bangladesh. The CNN-LSTM model provided robust performance compared to LSTM, radial basis function network (RBFN), and Extreme Gradient Boosting (XGBoost). Finally, Ref. [
22] introduced a novel parallel LSTM-CNN network to address the STLF problem in Smart Grids, in which the CNN and LSTM were trained separately. The LSTM-CNN model proved to be a good candidate. Regression models, on the contrary, did not perform well. The authors in [
23] were among the first to explore 2D CNN for forecasting electricity demand, in which the data were processed using four channels. The model performed reasonably well on the test set and captured the trends, especially for holidays. Ref. [
24] proposed a novel feature extraction framework based on 2D CNNs, with Singapore as a case study. The authors agreed that the model provided high feature extraction and was superior to other methods, such as ResNet. The authors in [
24] recently suggested a model based on CNN-BiLSTM for Smart Grids at the customer level, using big datasets from Turkey. CNN performed better than the machine learning methods and handled the missing data.
Several authors are starting to investigate the impact of the COVID-19 pandemic on the performance of electricity forecasting models. The authors in [
25] evaluated LSTM to forecast electricity demand for the Australian Energy Market, given the impact of COVID-19. The data were analyzed from January 2019 to August 2020. This study revealed that LSTM was very effective at learning about the drastic changes in electricity patterns caused by the lockdown. On the other hand, the researchers in [
26] evaluated the performance of three models: rolling ARIMA, traditional ARIMA, and ANN. The rolling ARIMA was the best model, obtaining a MAPE of 5.5% between March and May 2020. A remark was made on the ability of the model to perform well despite the high uncertainty caused by the pandemic. In [
27], a graph convolutional network based on representation learning was introduced to model the impact of various COVID-19-related features (i.e., mobility, the daily number of confirmed cases) on electricity demand in Houston, Texas. While the model was found to be robust, the authors found that the encoded features were not able to capture the effect of the pandemic fully.
This paper involves forecasting short-term electricity demand, an important field of application in Smart Grids in this machine-learning era. This study aims to develop a forecasting model using a machine learning approach to predict hourly electricity demand. A real case study of Panama’s power system is presented to validate the model. This case study was significant for understanding how short-term forecasting can help energy managers deal with the day-to-day operations of large-scale power systems. We experimented with several machine learning models such as SVR, Random Forest, XGBoost, Light Gradient Boosting Machine, Adaptive Boosting, Bi-LSTM, GRU, and a deep learning regression model. The contributions of this paper are the following. First, this paper experimented with a large dataset from 2016 to 2019 to test and evaluate the performance of several models for forecasting electricity demand. Second, we incorporated important features for predicting electricity demand, such as temperature, relative humidity, and time lags. The results indicated that these features were significant for improving forecasting accuracy. Third, we evaluated the performance of two well-known deep learning models based on Bi-LSTM and GRU for predicting electricity demand in multiple time steps. This paper is organized as follows:
Section 2 provides a high-level overview of the framework and describes the different methods used.
Section 3 provides a more detailed description of the case study and the implementation of the framework with the case study that includes data collection, data analysis, and model architecture.
Section 4 discusses the results obtained. Finally,
Section 5 presents the conclusions of this study.
3. Case Study
Panama’s electric grid is a complex system with the features mentioned above. Therefore, it will be used as a case study to help us understand the complexity of managing large-scale power systems. Panama is a relatively small country with a population of more than 4.2 million. It is described as one of the fastest-growing economies in Latin America. Panama’s electric grid has been described as a reliable system that has expanded its network capacity to meet its growing consumer demand over the past years. The grid is undergoing a rapid transformation as it starts integrating more wind and solar generation. Panama has set high goals to promote more renewable energy projects to reduce environmental impact and contribute to global sustainability. For years, the country has relied on a balance of hydroelectric and thermal power plants to meet consumer demand. However, thermal power plants have the disadvantage of high emissions and operating costs; therefore, the grid cannot depend entirely on these sources. Hydroelectric plants also pose a problem during the dry season, when there is insufficient water to fill the reservoirs and less energy is produced. Therefore, Panama has decided to diversify its energy matrix with more sustainable energy sources such as wind, solar, and natural gas to meet customer needs.
Similar to many countries worldwide, Panama faces challenges operating the power grid reliably and sustainably due to changes in supply and demand patterns. The electric grid has evolved into a more complex network of energy suppliers that serves a wide range of growing consumers, such as residential, commercial, and large clients with different consumption patterns. According to the National Secretary of Energy, Panama has an average of 1,152,300 electricity clients as of 2019. The electricity demand has exhibited an upward trend over the years, driven by the increase in population and foreign investments that have boosted the country’s economic growth.
3.1. Data Collected for the Case Study
The data collection process involved open relationships with several entities in Panama. Some of the data were publicly available, while the rest were provided by these organizations. Different types of information were collected to build and validate the models, as shown in
Table 1.
This study collected historical data on electricity demand for Panama’s power system from January 2016 to October 2019. The data were provided as an Excel file containing 33,600 data points of hourly demand collected from the commercial measurement systems. Each data point represents the total hourly demand of Panama’s different electricity consumption sectors, including residential, industrial, commercial, big clients, government use, and others.
3.2. Data Analysis
A boxplot was constructed to observe the hourly distribution of the electricity demand.
Figure 8 presents the boxplot of the average hourly electricity demand from 2016 to 2019. The electricity demand varies throughout the day, with a prolonged peak period. For example,
Figure 8 below shows that the peak period occurs between the 12th hour (noon) and the 15th hour (3:00 p.m.).
3.3. Feature Selection
Several input features were evaluated to identify those most significant for predicting electricity demand. A total of eight input features were studied for predicting electricity demand one hour and 24 h ahead, as shown in
Table 2.
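The lag-based features named elsewhere in the paper (the previous day's same-hour load, the previous week's same-day same-hour load, and the previous 24 h average load) can be derived from the hourly series alone. The following is a minimal sketch, assuming a gap-free hourly series; the function name `build_lag_features`, the dictionary keys, and the reading of "previous 24 h average" as the mean of the 24 hours before the target hour are illustrative assumptions, not the authors' implementation.

```python
from datetime import datetime, timedelta

HOURS_PER_DAY = 24
HOURS_PER_WEEK = 168


def build_lag_features(timestamps, load):
    """Return one feature dict per hour that has a full week of history."""
    rows = []
    for t in range(HOURS_PER_WEEK, len(load)):
        ts = timestamps[t]
        prev_24h = load[t - HOURS_PER_DAY:t]  # the 24 hours before hour t
        rows.append({
            "month": ts.month,
            "day_of_week": ts.weekday(),              # 0 = Monday
            "hour": ts.hour,
            "is_working_day": int(ts.weekday() < 5),  # ignores public holidays
            "prev_24h_avg": sum(prev_24h) / HOURS_PER_DAY,
            "prev_day_same_hour": load[t - HOURS_PER_DAY],
            "prev_week_same_hour": load[t - HOURS_PER_WEEK],
            "target": load[t],
        })
    return rows


# Synthetic example (not the Panama data): two weeks of values 0, 1, 2, ...
start = datetime(2016, 1, 1)
timestamps = [start + timedelta(hours=h) for h in range(2 * HOURS_PER_WEEK)]
load = [float(h) for h in range(2 * HOURS_PER_WEEK)]
features = build_lag_features(timestamps, load)
```

The first week of the series is consumed as history, so it yields no training rows. Temperature and relative humidity would be joined to each row from a separate weather source.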
3.4. Correlation Heatmap
The correlation heatmap is another data exploration tool that helps visualize which features are highly correlated with the electricity demand. For example, based on
Figure 9, it can be observed that electricity demand has a strong linear relationship with the following variables: the previous week’s same day same hour load (0.89) and the previous day’s same-hour load (0.8); and a moderate relationship with temperature (0.69).
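Each cell of such a heatmap is a pairwise correlation coefficient. The sketch below shows how one cell could be computed, assuming the Pearson coefficient is the measure used (the text says only "linear relationship") and using a synthetic series with a pure daily cycle rather than the Panama data.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Synthetic hourly load with a perfectly repeating daily pattern.
load = [1000 + 200 * math.sin(2 * math.pi * h / 24) for h in range(24 * 14)]

# Correlation between each hour and the same hour of the previous day.
lag = 24
r_prev_day = pearson(load[lag:], load[:-lag])
```

On this idealized series the previous-day lag correlates almost perfectly with the current load. Real demand carries noise and weekly effects, which is why the reported values fall below 1.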
3.5. Feature Importance
In addition to the correlation heatmap, the Random Forest Regressor from the Scikit-learn package was used to evaluate feature importance, since it provides built-in feature importance scores. The results are shown in
Figure 10. Once again, the most significant features were the previous week’s same day same hour load (0.72), the previous day’s same-hour load (0.14), and temperature (0.04).
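As a rough illustration of how such importance scores are obtained, the sketch below fits Scikit-learn's `RandomForestRegressor` and reads its `feature_importances_` attribute. The data are synthetic stand-ins, the feature names are hypothetical, and the tree settings (100 trees, minimum leaf size 1, minimum split size 2) follow the values the paper reports for its random forest model; `random_state` is added here only for repeatability.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n_samples = 500

# Synthetic, standardized stand-ins for three input features (not the Panama data).
feature_names = ["prev_week_same_hour", "prev_day_same_hour", "temperature"]
X = rng.normal(size=(n_samples, 3))

# The target depends strongly on the first feature, weakly on the second,
# and not at all on the third.
demand = 3.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(scale=0.1, size=n_samples)

model = RandomForestRegressor(
    n_estimators=100,
    min_samples_leaf=1,
    min_samples_split=2,
    random_state=0,
)
model.fit(X, demand)

# Impurity-based importances, normalized to sum to 1.
importances = dict(zip(feature_names, model.feature_importances_))
```

Because the scores are normalized, they are best read as relative shares. Impurity-based importances can also be diluted across correlated inputs, so they are worth checking against the correlation analysis.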
3.6. Building and Training of Models
This study compared and benchmarked several machine learning and deep learning models to predict short-term electricity demand in Panama. The machine learning methods included SVR, XGBoost, AdaBoost, random forest, and LightGBM, while the deep learning methods consisted of deep learning regression, Bi-LSTM, and GRU. As part of this study, it was essential to investigate the performance of Bi-LSTM and GRU networks for making multiple time step predictions 24 h ahead. The models were built and trained using open-source software such as Knime and Anaconda Python (https://www.anaconda.com/products/distribution (accessed on 10 September 2021)). The experiments were conducted using a Dell Inc. (Round Rock, TX, USA) Inspiron 15 7000 laptop with an Intel® Core™ i7-8565U CPU @ 1.80 GHz, 64-bit Windows 10 operating system, and 8 GB memory. Most models were built in Python 3.7.6 and Keras with TensorFlow as the backend.
Table 3 provides the input features that were used for predicting electricity demand.
3.7. Data Partitioning and Model Architecture for Machine Learning Models
The data were split into a training (80%) and test set (20%) while maintaining the temporal order of the data. The data from 1 January 2016, to 25 January 2019, were used as the training set. The data from 26 January 2019, to 31 October 2019, were used as the test set.
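A chronological split of this kind cuts the ordered series at a single index instead of sampling at random, so that no future observation leaks into training. The following is a minimal sketch; the function name `chronological_split` is hypothetical, and a list of integers stands in for the 33,600 hourly records.

```python
def chronological_split(series, train_fraction=0.8):
    """Split an ordered sequence into train and test parts without shuffling."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be between 0 and 1")
    cut = round(len(series) * train_fraction)
    return series[:cut], series[cut:]


hourly_index = list(range(33_600))  # placeholder for the hourly demand records
train, test = chronological_split(hourly_index)
```

With 33,600 hourly records, an 80% cut leaves 26,880 hours for training and 6,720 hours for testing, and every test record is later than every training record.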
Table 4 presents the model architecture for the machine learning models using several methods from the literature. For the random forest model, important parameters such as the number of decision trees used were set to 100, the minimum number of samples required to be a leaf node was set to 1, and the minimum number of samples required to split an internal node was set to 2.
3.8. Model Architecture for Deep Learning Models
For the deep learning models, it was important to define the hyperparameters, such as the number of dense layers and hidden units, learning rate, activation function, batch size, and epochs.
3.8.1. Deep Learning Regression
Table 5 presents the architecture for building the deep-learning regression model in Knime. To effectively learn the complex nonlinear patterns and relationship between the several input features and the output (demand), a neural network of three dense layers with 95 hidden units each was considered. The model was trained for 500 epochs with a batch size of 50. The Stochastic Gradient Descent optimizer was selected with a learning rate of 0.01.
3.8.2. Bi-LSTM
The Bi-LSTM model architecture consists of three stacked Bi-LSTM layers with 70 hidden units each, followed by a dense layer of 24 hidden units (
Table 6). The model was trained for 500 epochs with small batch sizes of 30. The model receives an input sequence of the 48 previous electricity demand hours comprising seven input features.
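A recurrent model with this input shape needs the hourly records arranged into overlapping windows. The sketch below pairs each block of 48 past hours (seven features per hour) with the following 24 demand values, which is consistent with a 24-unit output layer producing one value per forecast hour. The function name `make_windows` and the one-hour stride are assumptions, and the data are synthetic.

```python
def make_windows(rows, targets, n_in=48, n_out=24):
    """Pair each block of n_in past rows with the next n_out target values."""
    X, y = [], []
    last_start = len(rows) - n_in - n_out
    for start in range(last_start + 1):
        X.append(rows[start:start + n_in])
        y.append(targets[start + n_in:start + n_in + n_out])
    return X, y


# Synthetic example: 100 hours with 7 placeholder features per hour.
rows = [[float(h)] * 7 for h in range(100)]
targets = [float(h) for h in range(100)]
X, y = make_windows(rows, targets)
```

Each element of `X` has shape (48, 7) and each element of `y` has length 24, so the stacked arrays match the (samples, time steps, features) layout expected by Keras recurrent layers.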
3.8.3. GRU
The GRU model architecture consists of three stacked GRU layers, with 80 hidden units each, followed by a dense layer of 24 hidden units (
Table 7). The model was trained for 500 epochs with small batch sizes of 30.
5. Conclusions
Electricity forecasting is essential in supporting the reliable transitioning of power systems in this rapid digital era. The advances in big data, IoT, and machine learning have provided researchers and the industry with numerous opportunities to support more robust forecasting. However, challenges still exist for delivering more accurate forecasts due to the granularity and quality of the data collected from sensors and SCADA systems, the nonlinear and noisy patterns presented in the data, and the complex features that affect it.
To validate the methodology, this research introduced a case study on Panama’s power system. This case study was significant to understanding where power systems currently stand, their challenges, and how they are beginning to prepare for the future. The case study revealed that energy managers are becoming more concerned about the grid’s reliability.
The methodology first addressed two research questions: (1) Which features are the most significant for predicting electricity demand in the short term? and (2) Which methods are the most effective for capturing hourly demand? To answer them, we evaluated nine input features for forecasting hourly demand. These were the month, day of the week, the hour of the day, the previous 24 h average load, working day/weekend indicator, temperature (°C), relative humidity, previous day’s same-hour load, and previous week’s same-day same-hour load. Feature importance based on the random forest regressor revealed that the most significant features were the previous week’s same-day same-hour load (0.72), the previous day’s same-hour load (0.14), and temperature (0.04). Several models were proposed for the complex nonlinear mapping between the nine input features and electricity demand (target variable). The deep learning regression model performed the best for predicting demand one hour ahead, with an R² value of 0.93 and MAPE of 2.90%. The reason behind this was that the deep learning regression model uses a more robust approach by stacking multiple hidden layers, allowing it to learn complex patterns present in the data.
Furthermore, deep learning tends to perform better when trained with large datasets. Therefore, to improve the predictive performance, we used a deep learning model consisting of three dense layers with 95 hidden units each. Unfortunately, the model required an average training time of 20 min due to the number of hidden layers used. Among the deep learning models, the GRU multi-step model performed the worst because it uses a long input sequence of 72 h to predict electricity demand 24 h ahead, which can lead to the vanishing gradient problem. Therefore, it became evident that multi-step prediction problems remain a challenging research area. The study also found that AdaBoost had the worst performance among the machine learning models, with an R² value of 0.75 and MAPE of 5.70%.
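For reference, the two metrics quoted above can be computed as follows. This is a sketch using the standard textbook definitions of MAPE and the coefficient of determination; the paper's own formulas are not shown in this section, so the match is an assumption, and the numbers are illustrative, not results from the study.

```python
def mape(actual, predicted):
    """Mean absolute percentage error in percent; assumes no actual value is zero."""
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Illustrative values in MW (not taken from the paper).
actual = [1000.0, 1100.0, 1200.0, 1300.0]
predicted = [1010.0, 1089.0, 1212.0, 1287.0]

error_pct = mape(actual, predicted)
fit = r_squared(actual, predicted)
```

MAPE expresses the typical error as a share of the actual load, which makes it comparable across systems of different sizes, while R² measures how much of the variance in demand the model explains.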