1.1. Motivation and Background
Thailand has 100% access to electricity [1] in both urban and rural areas, and it is the second-largest economy and the fourth-largest country by population in Southeast Asia. This population and economic growth have increased electricity demand by an annual average of 684 MW since 1987 [2]. Among Asian countries, the People’s Republic of China (PRC) accounts for most of the energy demand, whereas Thailand ranks fifth, after India, Korea, and Indonesia [3]. The electricity supply industry of Thailand consists of the Electricity Generating Authority of Thailand (EGAT), the Metropolitan Electricity Authority (MEA), and the Provincial Electricity Authority (PEA): EGAT is responsible for generation, MEA for distribution in the metropolitan area around Bangkok, and PEA for distribution in the rest of the country. The scope of this paper is centered on the metropolitan area.
Short-term electricity demand load forecasting has become an extremely important issue for energy suppliers, system operators, and other market participants. In many deregulated electricity markets around the world, electricity demand is fixed a day before delivery by concurrent (semi-)hourly auctions. Improved accuracy of demand load forecasting can reduce operating costs and improve the reliability of power supply. The volatile and complex behavior of electricity demand makes the development of robust models a persistent challenge for researchers. Smart technologies such as the Internet of Things need sufficient power to connect “anything from anywhere” [4]. Industry automation and a large penetration of plug-in electric vehicles are expected for a reliable, economically competitive, and environmentally sustainable electric system [4,5]. The fifth assessment report of the Intergovernmental Panel on Climate Change [6] suggested that global warming has already made the world 0.74 °C warmer, and the warming is forecast to reach 1.8–4 °C by the end of the century [7,8,9]. This increase in temperature has already shifted the highest peak demand load in Jordan from evening hours to daytime hours because of the intensive use of air conditioning (AC) systems [7]. Outside temperature has been shown to be the most influential of the variables that can explain potential variations in demand [10,11]. Furthermore, this influence is typically linked to the penetration of electric cooling and heating appliances, mainly used by households and firms in the residential and commercial sectors [11]. Heating, ventilation, and AC systems consume more than half of the energy used in residential buildings [12]. In addition to temperature, deterministic variables such as day type and season also affect demand. The behavior of demand depends strongly on the sector in which the electricity is used. The influence of residential-sector behavior on energy consumption is becoming increasingly significant [13]. Since electricity cannot be stored efficiently, end-user demand must be tightly controlled for effective power system management [14]. Therefore, the main objective of this article is to construct an accurate forecasting model and to provide a quantitative analysis of the factors that influence demand load, which helps to ensure the stability of the energy system in operation and to maintain a secure, adequate, and efficient electricity supply by reducing blackout risk. To avoid ambiguity, the rest of this paper uses the terms “demand forecasting” and “demand” to refer to “electricity demand forecasting” and “electricity demand”, respectively.
Electricity consumption fluctuates because of the random behavior of consumers, but it also shows predictable seasonal, weekly, and daily patterns. The variation in demand across morning, daytime, and night hours causes intraday patterns. To handle these intraday periodicities, the researcher has two alternatives: the first is to allocate individual variables to each hour/half-hour [15,16], while the second is to construct a separate model for each hour/half-hour so that intraday seasonality vanishes [17,18,19]. Similarly, weekly seasonality arises from working and non-working days. To remove it, individual variables for each day are implemented as dummy variables [17,18,19,20]. Annual seasonality arises from the hot and cold seasons, and dummy variables are again used to remove it. Monthly seasonality is also handled with dummy variables. To improve the forecasting accuracy, interactions of these seasonal variables are implemented in [15] for the Thai dataset.
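The dummy-variable treatment of seasonality described above can be illustrated with a short sketch. It is not the encoding used in [15] or in this paper: the feature names, the choice of reference levels, and the weekend-by-month interaction are assumptions made only for illustration.

```python
from datetime import datetime


def seasonal_dummies(ts: datetime) -> dict:
    """Return 0/1 indicator features for one half-hourly timestamp.

    Feature names and reference levels are hypothetical choices.
    """
    features = {}
    # Day-of-week dummies; Monday (weekday 0) is the omitted reference level.
    for d in range(1, 7):
        features[f"dow_{d}"] = int(ts.weekday() == d)
    # Month dummies; January is the omitted reference level.
    for m in range(2, 13):
        features[f"month_{m}"] = int(ts.month == m)
    # Half-hour-of-day dummies (48 periods); period 0 is the reference level.
    period = ts.hour * 2 + ts.minute // 30
    for p in range(1, 48):
        features[f"period_{p}"] = int(period == p)
    # One illustrative interaction: weekend indicator times month dummy.
    weekend = int(ts.weekday() >= 5)
    for m in range(2, 13):
        features[f"weekend_x_month_{m}"] = weekend * features[f"month_{m}"]
    return features


row = seasonal_dummies(datetime(2013, 4, 13, 14, 30))  # a Saturday in April
active = sorted(name for name, value in row.items() if value == 1)
print(active)  # ['dow_5', 'month_4', 'period_29', 'weekend_x_month_4']
```

Dropping one reference level per group avoids perfect collinearity with the regression intercept; the alternative of fitting a separate model per half-hour would simply omit the `period_*` columns.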
Demand load forecasting with a linear model is challenging because of several levels of seasonality and non-linear factors such as temperature, holidays, and special events. Auto-regressive moving average (ARMA) structures can handle such seasonal and cyclic behavior. We therefore construct lag-structure-based ARMA models with exogenous variables (ARMAX), similar to [16,17,20,21], because such models have been extensively applied in the demand forecasting literature for better accuracy [16]. The main limitation of ARMAX is over-forecasting for Saturdays and under-forecasting for Mondays [21]. Saturdays are over-forecast because demand is higher on Fridays than on Saturdays, and the one-day lagged (Friday) demand is used to forecast Saturday. Similarly, Mondays may be under-forecast because the one-day or two-day lagged demand comes from Sunday or Saturday. A simple way to deal with this issue is to include interaction variables between lagged demand and day-of-the-week dummy variables, which is applied in our modeling.
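The effect of interacting lagged demand with day-of-the-week dummies can be seen in a toy example. The sketch below uses synthetic daily demand with a fixed weekly shape (all numbers are made up) and a regression through the origin on the one-day lag only; it is a simplified stand-in for the ARMAX specification, not the model estimated in this paper.

```python
# Synthetic daily demand with a fixed weekly shape (illustrative numbers only):
# Mon-Fri = 100, Sat = 80, Sun = 70. Index 0 is a Monday.
WEEKLY_SHAPE = [100.0, 100.0, 100.0, 100.0, 100.0, 80.0, 70.0]
demand = [WEEKLY_SHAPE[t % 7] for t in range(7 * 8)]


def fit_single_slope(y):
    """OLS through the origin for y_t = b * y_{t-1}: one slope for all days."""
    num = sum(y[t] * y[t - 1] for t in range(1, len(y)))
    den = sum(y[t - 1] ** 2 for t in range(1, len(y)))
    return num / den


def fit_slope_per_day(y):
    """OLS for y_t = sum_d b_d * (dow_d * y_{t-1}).

    The interaction columns never overlap, so each b_d equals the
    through-origin slope computed on the rows that fall on weekday d.
    """
    slopes = {}
    for d in range(7):
        rows = [t for t in range(1, len(y)) if t % 7 == d]
        num = sum(y[t] * y[t - 1] for t in rows)
        den = sum(y[t - 1] ** 2 for t in rows)
        slopes[d] = num / den
    return slopes


b = fit_single_slope(demand)
b_day = fit_slope_per_day(demand)

friday = WEEKLY_SHAPE[4]
saturday_single = b * friday            # pooled slope applied to Friday demand
saturday_interact = b_day[5] * friday   # Saturday-specific slope

print(f"pooled slope: {b:.3f}, Saturday forecast: {saturday_single:.1f}")
print(f"Saturday slope: {b_day[5]:.3f}, Saturday forecast: {saturday_interact:.1f}")
print(f"Monday slope: {b_day[0]:.3f}")
```

With a single pooled slope close to one, Friday's demand is carried almost unchanged into Saturday, which reproduces the over-forecast. The day-specific slopes fall below one for Saturday and rise above one for Monday, which is the correction the interaction terms provide.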
1.3. Model Selection
Several modeling concepts for robust parameter estimation used prior to 1990 were discussed in [53], which concluded that MLR was superior [19,21,54]. The literature shows that the multiple-equations approach has the potential to achieve very competitive forecast accuracy. The main advantage of this approach is the interpretability of its explanatory factors. Since the primary goal of this paper is to analyze the marginal impact of temperature on electricity demand, constructing regression models is a good option [21]. However, the characteristics of electricity demand are highly non-linear. Because of their capability to handle non-linearity, artificial intelligence techniques, especially ANN, fuzzy methods, and the Support Vector Machine (SVM), are popular among researchers. For example, ANN-based methods [46,55,56], fuzzy interaction regression methods [57], ANN with Particle Swarm Optimization (PSO) and GA [24] to optimize the weights and avoid falling into local minima, SVM methods [23], Bayesian methods [19], and DNN methods [58] have been implemented.
ANN-based STDF models are systematically summarized, together with important criticisms of ANN, by Hippert et al. [56]. They pointed out that existing ANN papers claimed better performance without solid support. An FNN, a simple deep neural network, consists of more than the typical three layers of a multilayer perceptron. The deep structure increases the feature abstraction capability of neural networks. The numbers of layers and neurons are key to designing neural network structures. In a network with a single hidden layer, only the number of neurons in that layer can be varied, whereas in a DNN the depth of the network can also be varied. The number of hidden layers is selected based on forecasting accuracy. Darshana and Chawalit [50] designed a combined PSO algorithm and forecasted using the ANN technique. The overall MAPE for 2013 was found to be higher than
. The same authors [24] implemented a hybrid of PSO and the Genetic Algorithm (GA) to improve the yearly MAPE for 2013 to
which is quite impressive for Thai data. The minimum and maximum monthly average MAPE were 2.164% (April) and 6.761% (December), respectively. The main limitation of that paper lies in its data-cleaning procedure: the data for holidays, bridging days (working days between holidays), and abnormal days are replaced with a weighted moving average of randomly selected numbers, and forecasting is performed on the cleaned dataset, not the actual demand. Phyo et al. [22] also suggested that the data be cleaned and grouped into similar days. The authors implemented a DNN methodology to forecast the year 2013. To improve the prediction accuracy, [22] was also tested with cleaned data, similar to [24,50]. Su et al. [23] presented a similar methodology with a different algorithm. In both papers [22,23], the length of the training dataset and the temperature variables were the same, and both focused on data cleansing to obtain better forecasting accuracy. However, their accuracy was still lower than that of [24]. In this study, we use the same dataset provided by EGAT that was already used in [20,22,23,24,50].
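Since the studies above are compared by MAPE, a minimal sketch of the standard definition (the mean of absolute errors relative to actual demand, expressed in percent) may be helpful. The function name and the sample values are illustrative and are not taken from any of the cited papers.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent.

    Assumes every actual value is non-zero, which holds for system demand.
    """
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Illustrative values only: errors of 10% and 5% average to 7.5%.
print(mape([100.0, 200.0], [110.0, 190.0]))
```

Because MAPE is computed against whatever series is treated as "actual", a MAPE reported on cleaned data is not directly comparable with one computed on the observed demand.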
Because the literature on Thai data is limited, similar studies of other regions with weather conditions similar to those of Thailand are worth discussing. For example, Malaysia has approximately similar weather conditions to Thailand. Ismail et al. [54] implemented an MLR model to investigate the impact of temperature and holiday types, with MAPE
for a day-ahead forecast. Apart from artificial intelligence, other traditional and adaptive techniques such as Seasonal ARIMA (SARIMA) and Regression ARIMA (RegARIMA) were compared to linear regression for cognate energy prediction with weather variable selection. The results showed that linear regression was highly effective and better than other, more sophisticated techniques for the majority of simulations in [59] for China. A RegSARIMA model for predicting short-term daily peak demand, with a comparative analysis against SARIMA and Holt–Winters Triple (HWT) exponential smoothing models, was discussed by Chikobvu and Sigauke [60] for South Africa. The empirical results showed that the RegSARIMA model is capable of capturing important driving factors of demand. In other work by these authors [60,61], an additive regression model was used to forecast the daily peak demand. They concluded that demand in South Africa was more sensitive to cold temperatures than to hot temperatures because of the mild sub-tropical climate. A small number of customers is another reason for the high sensitivity of electricity demand to temperature, as discussed by Haben et al. [62], who studied a low-voltage grid of up to 150 customers in the United Kingdom.
As usual, the results show a correlation between demand and seasonality (temperature), and the best-performing methods were AR and HWT exponential smoothing. Our concern is with more robust modeling techniques that can handle the stochastic behavior of the variables. The use of categorical variables to formulate such classifications in a linear regression model is very common [17,21,30,44]. These categorical variables define the types of days and are translated into dummy variables that allow the regression model to account for each individual type of day. For short lead times of up to six hours ahead, univariate models are sufficient for good accuracy [16]. Taylor et al. [16] conducted a comparative study of univariate models against two benchmark models and obtained a MAPE of less than
. The underlying reasons for selecting univariate models are the short lead times of only a few hours and the difficulty of accessing costly weather data [19,63]. Unlike [16], EGAT makes its forecast at 2 p.m. for the next day, which is 10 to 34 hours ahead. On Fridays, the Thai practice is to forecast up to 106 h ahead if Monday is a holiday, and even longer during long holidays such as Songkran and New Year. EGAT makes such long-lead forecasts because its office is closed on weekends and holidays. However, this study is limited to lead times of 10 to 34 hours, assuming that data up to 2 p.m. are available to forecast the next day only.
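The lead times quoted above follow from simple clock arithmetic, sketched below. The helper name and the example dates are hypothetical, and reading the 106 h case as a Friday forecast that covers Saturday through Tuesday (because the next forecast can only be issued on Tuesday afternoon) is our interpretation of the figure, chosen because it reproduces it.

```python
from datetime import datetime, timedelta


def hours_between(start: datetime, end: datetime) -> float:
    return (end - start).total_seconds() / 3600


def lead_hours(issued_at: datetime, days_covered: int):
    """Return (shortest, longest) lead time in hours.

    The forecast is assumed to cover `days_covered` whole calendar days,
    starting at the first midnight after `issued_at`.
    """
    next_day = issued_at + timedelta(days=1)
    first_midnight = next_day.replace(hour=0, minute=0, second=0, microsecond=0)
    last_midnight = first_midnight + timedelta(days=days_covered)
    return (hours_between(issued_at, first_midnight),
            hours_between(issued_at, last_midnight))


thursday_2pm = datetime(2013, 4, 4, 14, 0)  # arbitrary ordinary working day
friday_2pm = datetime(2013, 4, 5, 14, 0)    # arbitrary Friday

print(lead_hours(thursday_2pm, 1))  # (10.0, 34.0): next day only
print(lead_hours(friday_2pm, 4))    # (10.0, 106.0): Saturday through Tuesday
```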
The MLR model with a dynamic error structure and adaptive adjustment of the forecast error proposed by Ramanathan et al. [17] was the winner of a demand forecasting competition. The dynamic error structure is intended to overcome the limitations of simple OLS. An alternative to the model of [17] is the quantile regression model, which is discussed and implemented by Sigauke et al. [64] for an hourly electricity demand model and by Botoc and Anton [65] to explain the profitability of firms. Hong et al. [57] reviewed the modeling techniques of the winning teams in the Global Energy Forecasting Competition 2012, where all four winning teams applied regression analysis, while only two teams implemented ANN. Therefore, we select the AR-based MLR model with uncorrelated error and adaptive adjustment for correlated error. For a comprehensive study, a feed-forward ANN, hereafter named the FF-ANN model, is constructed using the TensorFlow deep learning platform.
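To make the idea of adjusting a regression for correlated errors concrete, the sketch below applies a Cochrane-Orcutt style iteration to a one-regressor model with AR(1) errors. This is one common textbook procedure and is only an illustration under stated assumptions: the estimator actually used in this paper is not specified in this section, and the data, coefficients, and function names here are synthetic and hypothetical.

```python
import random


def ols_line(x, y):
    """Closed-form OLS for y = a + b * x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def ar1_coefficient(resid):
    """Regress e_t on e_{t-1} without intercept to estimate rho."""
    num = sum(resid[t] * resid[t - 1] for t in range(1, len(resid)))
    den = sum(resid[t - 1] ** 2 for t in range(1, len(resid)))
    return num / den


def cochrane_orcutt(x, y, iterations=10):
    """Iterate: compute residuals, estimate rho, quasi-difference, refit."""
    a, b = ols_line(x, y)  # plain OLS as the starting point
    rho = 0.0
    for _ in range(iterations):
        resid = [yi - a - b * xi for xi, yi in zip(x, y)]
        rho = ar1_coefficient(resid)
        x_star = [x[t] - rho * x[t - 1] for t in range(1, len(x))]
        y_star = [y[t] - rho * y[t - 1] for t in range(1, len(y))]
        a_star, b = ols_line(x_star, y_star)
        a = a_star / (1.0 - rho)  # undo the (1 - rho) scaling of the intercept
    return a, b, rho


# Synthetic data: y = 1 + 2 x + u, with u_t = 0.6 u_{t-1} + noise.
rng = random.Random(0)
n, true_a, true_b, true_rho = 2000, 1.0, 2.0, 0.6
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
y, err = [], 0.0
for xi in x:
    err = true_rho * err + rng.gauss(0.0, 0.5)
    y.append(true_a + true_b * xi + err)

a_hat, b_hat, rho_hat = cochrane_orcutt(x, y)
print(f"a = {a_hat:.3f}, b = {b_hat:.3f}, rho = {rho_hat:.3f}")
```

The quasi-differencing step removes the first-order serial correlation from the errors, so the second-stage regression satisfies the uncorrelated-error assumption that plain OLS violates on such data.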
1.4. Contributions
This paper follows a forecasting procedure similar to that of EGAT, in which data up to 2 p.m. are collected and the next day is predicted. The selection and implementation of the GLSAR model with attention to error lags, together with a comprehensive comparison of OLS, GLSAR, and FF-ANN, constitute the novel contribution of this work for future researchers. The major findings based on Thai electricity data for STDF are as follows:
The marginal impact of temperature that raises demand during day hours and night hours is explored for Thailand, which is useful for tropical countries.
The quantitative analysis of variables affecting demand, such as holidays, working days, working days after a holiday/long holiday, the AR effect, and special days/events such as the Bangkok flood, is discussed in detail.
The unexpected Bangkok flood and lockdown situation were quite similar to the current Covid-19 situation in terms of electricity demand. Therefore, researchers can extend this methodology to analyze the impact of Covid-19 on electricity demand.
Four different scenarios are constructed based on similar demand characteristics, which leads to the best prediction capability among the existing literature on the Thai dataset.
The strategies for selecting variables and for determining the training length of the dataset and the numbers of hidden layers and nodes are also major contributions of this study to improving accuracy.
The remainder of this paper is organized as follows.
Section 2 describes the overall methods including data pre-processing, modeling strategy, and estimation process of models.
Section 3 demonstrates an extensive empirical analysis of forecasting accuracy, the marginal impact of temperature and the quality of the model fit.
Section 4 summarizes the discussion and concludes this paper.