**About the Editors**

#### **Snezhana Gocheva-Ilieva**

Snezhana Gocheva-Ilieva, D.Sc., Ph.D., is currently a Guest Professor in the Department of Mathematical Analysis at the University of Plovdiv Paisii Hilendarski, Bulgaria. She received her M.Sc. degree and B.Sc. degree in Computational Mathematics from Sofia University "St. Kliment Ohridski" and received her Ph.D. degree in Physics and Mathematics from Taras Shevchenko National University of Kyiv. She received her Doctor of Sciences degree in Mathematics in 2016. From 1985 to 2011, Prof. Gocheva-Ilieva was an Associate Professor at the University of Plovdiv Paisii Hilendarski and has been a Full Professor here since 2011. She is a member of the Union of Bulgarian Mathematicians, and American Mathematical Society and is also a reviewer of Mathematical Reviews, Zentralblatt fur Mathematik, and ACM. Her primary research interests lie in ¨ modeling and simulations of plasma and laser physics and engineering, modeling in environmental science, theories and applications of finite difference methods for ordinary and partial differential equations, singular problems, applied computational statistics, applications of predictive data mining techniques, and machine learning. Her secondary interests fall into scientific computing, ICT, programming, and innovative technologies in mathematics education.

#### **Atanas Ivanov**

Atanas Ivanov, Ph.D., is currently an Associate Professor in the Department of Mathematical Analysis at the University of Plovdiv Paisii Hilendarski, Bulgaria. He received his Ph.D. degree from the same University in 2016. Dr. Ivanov researches data analysis, predictive modeling, time series analysis, and their applications in various areas.

#### **Hristina Kulina**

Hristina Kulina, Ph.D., is an Associate Professor in the Department of Mathematical Analysis at the University of Plovdiv Paisii Hilendarski, Bulgaria. She received her M.Sc. degree in 1993 from the University of Plovdiv Paisii Hilendarski and her Ph.D. degree in 2013 from the Bulgarian Academy of Sciences. Dr. Kulina has been an Associate Professor since 2016. She is currently a member of the Union of Bulgarian Mathematicians. Her primary research interests are in algebra, especially of orthogonal arrays and spherical designs. Her the current interests are focused on data analyses and applications of predictive machine learning techniques.

**Snezhana Gocheva-Ilieva \*, Atanas Ivanov and Hristina Kulina**

Department of Mathematical Analysis, University of Plovdiv Paisii Hilendarski, 4000 Plovdiv, Bulgaria; aivanov@uni-plovdiv.bg (A.I.); kulina@uni-plovdiv.bg (H.K.)

**\*** Correspondence: snow@uni-plovdiv.bg

Currently, we are witnessing rapid progress and synergy between mathematics and computer science. This interaction leads to a greater effect in solving theoretical and applied problems. In this context, and following the good results of the first Special Issue, "Statistical Data Modeling and Machine Learning with Applications", a second edition covering the same 15 topics was announced at the end of 2021. The present Special Issue (SI), like the first, concerns the section "Mathematics and Computer Science". In total, 35 manuscripts were submitted for review. Of these, after a strict peer-review process by at least three anonymous reviewers, 15 articles were accepted and published.

Study [1] proposes effective models for forecasting the maximum hourly electricity consumption per day in Slovakia. Four types of models were built: gray models (GM(1,1)), nonlinear gray Bernoulli models (NGBM(1,1)), one ANN (based on a multi-layer feedforward back-propagation (MLPFFBP) network), and a hybrid model. This approach includes a pre-processed data series that is used to build the transverse set of gray models, construct a special validation process for the MLPFFBP-ANN, and create a weighted hybrid model with GM(1,1) and the ANN. According to the three criteria, the models of the GM(1,1) set, ANN, and hybrid model reported better accuracy in forecasting values than officially provided forecasts, as the hybrid model has the best indicators.

In [2], a new simplified selective algorithm is proposed to increase the efficiency of ensemble methods based on decision trees and the index of agreement. This approach was demonstrated on real-world data to predict the 305-day milk yield of Holstein–Friesian cows. Using rotated principal components, classification and regression tree (CART) ensembles and bagging, and Arcing methods, a 30% reduction in the number of trees of the constructed selective ensembles was achieved. In addition, hybrid linear stacked models were built, yielding a 13.6% reduction in test set prediction errors compared to hybrid models with the nonselective ensembles.

The aim of paper [3] was to create an effective approach to detect and counter cyberattacks on Internet of Vehicular networks (IoV). An innovative, explainable neural network (xNN) model based on deep learning (DL), and in particular, Denial of Service (DoS) assaults, has been developed. To build the model, K-means was first applied for clustering, classification, and extraction of the best features for anomaly detection. After that, the xNN model was built to classify attacks. The model was tested on the two known empirical datasets, CICIDS2019 and UNSW-NB15. The calculated statistical indicators showed that the proposed feature-scoring approach outperforms the known published results in this field.

Publication [4] deals with single-index quantile regression (SIQR), a type of semiparametric quantile regression for analyzing heterogeneous data. The quantile regression method with the SCAD penalty and Laplace error penalty were used to construct two sparse estimators for the considered SIQR. This leads to an efficient procedure for variable selection and parameter estimation. Theorems were proved for the N-consistency and oracle properties of the proposed estimators. Computer simulations with benchmark data

**Citation:** Gocheva-Ilieva, S.; Ivanov, A.; Kulina, H. Special Issue "Statistical Data Modeling and Machine Learning with Applications II". *Mathematics* **2023**, *11*, 2775. https://doi.org/ 10.3390/math11122775

Received: 15 June 2023 Accepted: 16 June 2023 Published: 20 June 2023

**Copyright:** © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

1

samples were performed. The method was shown to exhibit some resistance to heavy-tail errors and outliers while increasing the accuracy of parameter estimates.

Paper [5] presents a new computationally and highly efficient hybrid Bayesian network training algorithm called Forward with Early Dropping Hill Climbing (FEDHC), which is applicable to continuous or categorical variables. The algorithm applies the forward– backward-with-early-dropout (FBED) variable selection to each variable as a means of skeleton identification, followed by a hill-climb (HC) scoring phase. Another advantage of the proposed version of FEDHC is its robustness against outliers. FEDHC, PC Hill Climbing (PCHC), and Max–Min Hill Climbing (MMHC) were illustrated on two real cross-sectional datasets. A new, computationally efficient implementation of MMHC was also suggested.

In [6] the problem of estimating the graphs of conditional dependencies between variables under Gaussian settings is investigated. The authors present an improved Jewel 2.0 version of their previous Jewel 1.0 method. This was achieved on the basis of regressionbased problem formulation with the appropriate minimization algorithm. Other contribution of the work is the proposed stability selection procedure that reduces the number of false positive scores in the estimated graphs. Simulation experiments were conducted.

The authors of [7] applied nonlinear autoregressive exogenous (NARX) networks coupled with an optimizing algorithm for wavelet filtering for modeling long-term dependencies and anomaly detection in noisy and nonstationary time series. A procedure using wavelet packets and stochastic thresholds was developed to approximate the decomposed components of the original data. The suggested wavelet filtering allows for the construction of a more accurate predictive NARX model. In addition, the NARX model was applied for anomaly detection. The results are demonstrated for ionospheric parameter time series prediction and ionospheric anomaly detection.

In paper [8], a method was developed for estimating the consolation prize of a slot machine jackpot using multidimensional integrals. Various modifications of the stochastic quasi-Monte Carlo approaches, such as lattice and digital sequences, Halton and Sobol sequences, and Latin hypercube sampling, were used to calculate the integrals. The expectations of the real consolation prize were evaluated, depending on the distribution of time and the number of players. The method was generalized for a multidimensional case. Computational experiments were performed.

Article [9] presents new theoretical and applied results in stochastic processes in spatial kinematics and line geometry for modeling some characteristics of 3D surfaces. The authors introduced theoretical principles on line-element geometry, kinematic surfaces, and the Gaussian process latent variable model (GPLVM). A method for surface approximation, unsupervised surface segmentation, and surface denoising in 3D modeling was described, which was based on the Bayesian GPLVM and the GPLVM with back constraints. The results were illustrated on sets with artificial and real-world objects.

In [10], a new method that aggregates five machine learning (ML) methods from different classification groups and a binary regression algorithm is proposed. The realworld task of predicting the impact of meteorological factors on the appearance of traffic accidents was solved. The most significant meteorological factors for road accidents were identified. The model was implemented as one of the agents in a two-agent system: agent 1 draws knowledge through ML from historically available data, and agent 2 deals with the same parameters, but in real-time. The suggested two-agent system can be implemented for providing early-warning alerts to citizens and traffic police, including through social media platforms.

The authors of [11] developed a novel, general multi-step-ahead strategy to forecast time series of air pollutants, extending the known multiple-input multiple-output (MIMO) strategy. The suggested strategy presupposes the availability of external independent forecasts for meteorological, atmospheric, and other variables, and continuously updated datasets. A new computational scheme was proposed for h-vector horizon prediction for each forward step. The strategy was applied to forecast the daily concentrations of pollutants PM10, SO2, and NO2 17 horizons ahead, with h = 10 days. Random forest (RF) and arcing (Arc-x4) ML algorithms were used for modeling. The comparison with the existing strategies showed the advantage of the proposed one.

Paper [12] presents a novel credit card fraud detection scheme, RaKShA, which is integrated with explainable artificial intelligence (XAI) and long short-term memory (LSTM), i.e., the X-LSTM model, and the output is verified via a smart contract (SC). The results are stored in the InterPlanetary File System (IPFS), which is referenced on the public blockchain network. The proposed approach addressed the limitation of traditional fraud detection by providing model interpretability, improved accuracy, security, and transparency. The X-LSTM model was found to increase the power of the LSTM model in detecting credit card financial fraud (FF) and to make the scheme scalable and adaptable, which helps users to protect themselves from FF.

Paper [13] presents an efficient one-stage model for automatic lung tumor detection in computed tomography (CT) images, called ELCT-YOLO. It was designed to solve the problem of scales and meet the requirements of real-time tumor detection. The ELCT-YOLO model implemented a specially designed neck structure and a novel Cascaded Refinement Scheme (CRS) to process context information. The results of empirical tests showing the advantages of the model were presented.

In [14] a Light Gradient Boosting Machine (LightGBM) model is utilized to classify and predict leisure time. The SHapley Additive exPlanation (SHAP) approach was applied to conduct feature importance analysis and influence mechanism analysis of factors from four perspectives: time allocation, demographics, occupation, and family characteristics. The results verified that the LightGBM model effectively predicts personal leisure time.

A two-layer autoencoder neural network architecture, singular-spectrum analysis (SSA) decomposition, and an adaptive anomaly detection algorithm (AADA) were used in [15] to process natural data of a complex, noisy nature. The AADA includes wavelet transforms whose accuracy is set with appropriate thresholds. These methods were applied for the analysis and detection of anomalous decreases that occurred before the geomagnetic disturbances. High-performance hybrid models were built for the study of cosmic ray data. The hybrid SSA-AADA models reach about 84% efficiency in anomaly detection, while the Autoencoder-AADA models reach about 87%.

To summarize, we should emphasize that the results of the published articles fully correspond to the formulated goal and topics of the SI, "Statistical Data Modeling and Machine Learning with Applications II". Their main contributions are classified in Table 1. We can note that the selected 15 topics are well covered. The hybrid models, machine learning algorithms, and nonparametric statistical modeling received the greatest interest.


**Table 1.** Classification by topics of the main contributions of the articles published in the SI.

In conclusion, new mathematical methods and approaches, new algorithms and research frameworks, and their applications aimed at solving diverse and nontrivial practical problems are proposed and developed in this SI. We believe that the chosen topics and results are attractive and useful for the international scientific community and will contribute to further research in the field of statistical data modeling and machine learning.

**Funding:** This research received no external funding.

**Acknowledgments:** The research activity of the first Guest Editor of this Special Issue was conducted within the framework of and was partially supported by the MES (Grant No. D01-168/28.07.2022) for NCDSC, as part of the Bulgarian National Roadmap on RIs.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


**Disclaimer/Publisher's Note:** The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

## *Article* **Forecasting of Electrical Energy Consumption in Slovakia**

**Michal Pavlicko \*, Mária Vojteková and Ol'ga Blažeková**

Department of Quantitative Methods and Economic Informatics, Faculty of Operation and Economics of Transport and Communications, University of Žilina, Univerzitná 1, 01026 Žilina, Slovakia; maria.vojtekova@uniza.sk (M.V.); olga.blazekova@uniza.sk (O.B.) **\*** Correspondence: michal.pavlicko@uniza.sk; Tel.: +421-41-513-3274

**Abstract:** Prediction of electricity energy consumption plays a crucial role in the electric power industry. Accurate forecasting is essential for electricity supply policies. A characteristic feature of electrical energy is the need to ensure a constant balance between consumption and electricity production, whereas electricity cannot be stored in significant quantities, nor is it easy to transport. Electricity consumption generally has a stochastic behavior that makes it hard to predict. The main goal of this study is to propose the forecasting models to predict the maximum hourly electricity consumption per day that is more accurate than the official load prediction of the Slovak Distribution Company. Different models are proposed and compared. The first model group is based on the transverse set of Grey models and Nonlinear Grey Bernoulli models and the second approach is based on a multi-layer feed-forward back-propagation network. Moreover, a new potential hybrid model combining these different approaches is used to forecast the maximum hourly electricity consumption per day. Various performance metrics are adopted to evaluate the performance and effectiveness of models. All the proposed models achieved more accurate predictions than the official load prediction, while the hybrid model offered the best results according to performance metrics and supported the legitimacy of this research.

**Keywords:** forecasting model; electricity energy consumption; grey model; artificial neural network

**Citation:** Pavlicko, M.; Vojteková, M.; Blažeková, O. Forecasting of Electrical Energy Consumption in Slovakia. *Mathematics* **2022**, *10*, 577. https://doi.org/10.3390/ math10040577

Academic Editor: Snezhana Gocheva-Ilieva

Received: 14 January 2022 Accepted: 10 February 2022 Published: 12 February 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

#### **1. Introduction**

Precise electrical energy consumption (EEC) forecasting is of great importance for the electric power industry. Electricity consumption can be directly or indirectly affected by various parameters. Such parameters include previous data of consumption, weather, population, industry, transportation, gross domestic product, and so on. Electricity consumption generally has a stochastic behavior that makes it hard to predict. Consumption overestimation would lead to redundant idle capacity, which would be a waste of funds. Underestimating, on the other hand, would lead to higher operating costs for energy suppliers and potential energy outages. Hence, precise forecasting of electricity consumption is crucial to avoid costly errors.

Another specificity of energy is its liberalization, which has led to a gradual separation of electricity generation, its distribution, and trading of this commodity. Thus, the ability to predict for most electricity market participants has become absolutely crucial for their functioning in a market environment.

For electricity producers, the essential information is how much electricity they will be able to deliver in the future, what their production costs will be, and at what price they will sell electricity on the market. The production of conventional sources is easier to plan and predict compared to the not entirely stable and intermittent production of photovoltaic and wind power plants.

#### *Energy Profile of Slovakia*

As the standard of living increases, the need for electricity grows not only in Slovakia but throughout the European Union. A significant share of the increase in consumption is due to the industry, which by introducing automation and robotization, has higher demands on electricity. Communal household consumption is also increasing due to the purchase of new electrical appliances. In forecasting the development of electricity consumption in the coming years, it will be necessary to consider the increase in transport consumption (expected by the massive production of electric vehicles), but at the same time consumption trends in electrical appliances and light sources in line with EU objectives.

In 2019, approximately 53.7% of the total production of electricity in Slovakia was obtained from nuclear power stations. The second biggest share was from fossil fuels (21.7%), and the third one was from waterpower (16.1%). The share from renewable sources (RES) was only 8.1%. Natural gas has the largest share (49.6%) in the production of electricity from fossil fuels, followed by brown coal (22.8%), and black coal (14.2%). The electricity production from RES was mostly from biomass (48.6%) and biogas and photovoltaic power plants participated by one-fourth (24.5% and 25.2%). The decrease in electricity production from RES in 2019 compared to 2018 was recorded in the case of biomass (95.3%) and biogas (95.2%). A similar decrease was documented for photovoltaic power plants (100.2%). Hydropower plants, on the other hand, recorded a significant increase in electricity production (117.7%) [1]. Electricity production by source in the period 2000–2020 is illustrated in Figure 1.

**Figure 1.** Electricity production by source in Slovakia [2].

After a significant decrease in electricity production in 2018, its production grew substantially in 2019. The amount of electricity produced from Slovak sources increased by 1460 GWh compared to 2018 and achieved 28,610 GWh in 2019. In 2016, electricity consumption crossed 30,000 GWh for the first time in history. In the following years, the consumption grew continuously until the culmination in 2018. However, in 2019 electricity consumption achieved 30,309 GWh, which means a considerable decrease compared to the previous year (−637 GWh, year-to-year index of 97.9%) and in 2020 the value was again below the level of 30,000 GWh [1]. The amounts of electricity production and consumption in Slovakia during the years 2000–2020 are listed in Figure 2.

**Figure 2.** Gross electricity production and consumption in Slovakia [3].

The electricity system of the Slovak Republic (ES SR) is operated in parallel within the European Network of Transmission System Operators (ENTSO-E). Since 2015, SEPS (Slovenská Elektrizaˇcná Prenosová Sústava, a. s.) is the only operator of the transmission system in the Slovak Republic. SEPS is the national provider of transmission and system services. Other key activities include providing auxiliary services and controlling the transmission system components as a dispatcher. It also serves as a facility providing support services and supplies of regulating electricity obtained within the Grid Control Cooperation (GCC) [1].

By initiation of auxiliary services, SEPS ensures a balance between the electricity production and consumption in Slovakia. This endeavor has to consider various circumstances and consider concluded agreements in the field of international electricity exchange.

Since 2011, company OKTE, a. s. serves as a coordinator of the short-term electricity market. It participates in all activities associated with the development, implementation, and operation of the single coupling of cross-border, day-ahead, and intraday markets in electricity within the European Union. Additionally, OKTE, a. s., evaluates and resolves disparities between demand aggregated and local distribution system deliveries [1].

The cross-border flows balance of ES SR has been in favor of import direction since 2007. In 2020, a substantial decrease in cross-border transmissions was measured compared to 2019. In the import direction, they were lower in total by 250 GWh, but in the export direction they dropped by 1132 GWh, and the import balance from 2014 achieved the lowest value (318 GWh) [1]. The amounts of imports, exports, and the differences between imports and exports of electricity in Slovakia during the years 2000–2020 are shown in Figure 3.

Electricity cannot be stored in significant quantities, nor is it easy to transport, and production units often face flexibility constraints. This implies an accurate estimate of electricity consumption on an hourly and daily basis.

**Figure 3.** Cross-border electricity transmissions of Slovakia [4].

#### **2. Literature Review**

Various prediction models and methods are broadly explored to accurately forecast the electrical energy consumption because of economic, environmental, and technical reasons. In the field of energy consumption prediction, a variety of different methods have been proposed based on data analysis, including Box–Jenkins models, grey prediction models (GM), fuzzy logic methods, artificial neural networks (ANN), vector regressions, etc. Hybrid models combine features and benefits of parental methods to improve the accuracy of their predictions and help capture different data properties and avoid the limitations of a single method.

The GM's popularity in time series forecasting is probably due to its relative simplicity and capacity to model an unknown system by utilizing a small dataset. In addition to the basic GM(1,1) model, various variants of grey models to improve the accuracy of predicting energy consumption in China were used in [5–16].

Ayvaz in [17] introduced three different grey forecasting models for modeling and predicting yearly net electricity consumption in Turkey. A comparison of ARIMA and GM for electricity consumption demand forecasting in Turkey was applied in [18].

A gross final energy consumption, the energy consumption of renewable energy sources, and its share in France, Germany, Italy, Spain, Turkey, and the United Kingdom were forecasted using optimized fractional nonlinear grey Bernoulli model in [19].

The use of machine learning techniques, especially ANN, has become increasingly popular in many forecasting models, i.e., electricity demand prediction in Spain [20]; shortterm electricity consumption in Spain [21]; an accurate forecast of the exploitable energy from renewable energy sources in Milan, Italy [22]; and a prediction of the 24-h electricity power demand in Poland [23].

Model hybridization or combining several similar or different methods has become a widespread practice to improve prediction accuracy. For example, Hu in [24] combined GM and neural networks to demonstrate the hybrid method effectiveness and forecasted EEC in Turkey. The grey and vector autoregressive models were coupled to improve their accuracy in [25]. The hybridizing support vector regression model with evolutionary algorithms was described in [26]. A comparison of statistical and machine learning models was given, and the use of hybrid models in electric power forecasting was identified in [27].

Regarding literature about forecasting EEC in the Slovak Republic, there are only a few articles. In [28], a comprehensive overview of energy consumption of six Central European countries, including Slovakia, was given. Avdakovic in [29] studied the correlation between air temperature and electricity demand using a linear regression and wavelet coherence approach in UK, Slovakia, and Bosnia and Herzegovina. Laurinec, in [30], presented a method for forecasting a load of individual electricity consumers using smart grid data and clustering. A time series prediction methodology based on deep neural networks was presented in Jarabek's research [31]. A long, short-term memory algorithm with a sequence-to-sequence architecture was used to improve the prediction accuracy of electricity consumption of Slovak enterprises. Oudjana, in [32], proposed a model employing neural networks and particle swarm optimization for a long-term forecast based on historical loads databases of Slovak power systems. Halaš, in [33], used biologically inspired algorithms and compared prediction accuracy of the ensemble learning model to base forecasting methods. Brozyna in [34] focused on the renewable energy and EU 2020 target for energy efficiency in the Czech Republic and Slovakia.

The rest of the paper is organized as follows: the description of data and the basic knowledge of proposed grey models, ANN, and hybrid model is in Section 3; Section 4 focuses on achieved results and introduces all the experimentation implemented; a discussion about the obtained outcomes is detailed in Section 5; and finally, Section 6 compiles the conclusions achieved from this research.

#### **3. Methodology**

The research was developed in three main stages: first, data collection and preprocessing for considered models; second, modeling the problem and implementation of the proposed models; and finally, validation and analysis of the results.

#### *3.1. Data*

The data were acquired via the Damas Energy information system (Damas) provided by SEPS [35]. The data of the official system load prediction was provided also by SEPS and can be found in [35]. It is worth note that the official load prediction is a one-year-ahead prediction usually published at the beginning of the forecasted year. The data provided by Damas contains all hourly consumptions in Slovakia, but the official load predictions contain only values for Wednesdays. For comparison, it was decided to make a prediction model only for Wednesdays. Hence, the maximum hourly electricity consumption for each Wednesday was extracted from the data.

The data from 2010 to 2018 were used for training purposes and the data from 2019 and 2020 were used for evaluation of prediction power, i.e., testing. However, the year 2019, which was not affected by COVID lockdown, was evaluated separately from the year 2020, which was affected by the pandemic. Note that the learning sample was divided into a training and validation subsample in the case of ANN, which is described in detail later in Section 3.3.1.

Electricity consumption is heavily influenced by various individual factors that can be easily determined, such as the time of day, day in the week, season of the year, or whether it is working day or weekend, etc. Therefore, the data were pre-processed for the use of individual models as follows.

#### 3.1.1. Data Pre-Processing for Grey Models GM(1,1) and NGBM(1,1)

Normally, the data for prediction models are inputted in chronological order as measured. The grey theory is aimed at predicting the trend in data, such as a series of annual maxims or annual averages of energy consumption in a multiannual time series. The grey models are not meant to predict the periodically changing data series. However, the theory can be used, for example, to find out the trend of the first Wednesdays in the consecutive years, then the second one, etc. The data series were pre-processed accordingly, i.e., a matrix of 53 rows and 9 columns was constructed. Each row represents the input for the self-standing grey model, which in the final gives 53 grey models for respective Wednesdays in the year. There are usually 52 Wednesdays a year and only occasionally it is 53, therefore in the 53rd row the missing values were duplicated by the values from

the 52nd row. In our case, there are just two years with 53 Wednesdays, one of which is situated in training data and one in the testing data.

#### 3.1.2. Data Pre-Processing for ANN

As explained above, our data concern only Wednesdays, because of the comparison with the official prediction. For this purpose, the data were pre-processed for the ANN model as follows. Except for the target values of the maximum hourly electricity consumption per day, the following variables were added:


#### *3.2. Grey Models*

The grey prediction is based on the grey systems theory. The methodology was created by Julong Deng in 1982, and it concentrates mainly on the study of problems concerning small samples and poor information [36]. The name "grey" was taken from control theory where the color shade has been usually used to specify the certain degree of information clarity. The adjective "grey" means that information is partially known and partially unknown. The methodology exhibits in systems with partially known information acquired via various means, such as generating, excavating, or extracting available information. Liu and Yang compared grey systems with other models dealing with some degree of uncertainty, such as stochastic probability, rough set theory, and fuzzy mathematics [37], and one can find in this study the discussion about the progress that grey systems theory has made in the world of learning and its wide-ranging applications in the entire spectrum of science.

#### 3.2.1. Grey Model GM(1,1)

The grey models have become very popular among scholars due to its simple principle and high forecast accuracy. The forecasting via grey models is based on the grey generating function GM(1,1) which utilizes the variation within the system in order to discover the relationships between sequential data. The prediction of the model is based on this relationship. The grey model incorporates a system of first-order differential equations to forecast a time series prediction. The algorithm of forecasting based on the GM(1,1) grey prediction model can be summarized as follows [36].


$$x^{(1)}(k) = \sum\_{i=1}^{k} x^{(0)}(i). \tag{1}$$

3. Compute the mean value of the first-order AGO sequence members

$$z^{(1)}(k) = 0.5 \left[ \mathbf{x}^{(1)}(k) + \mathbf{x}^{(1)}(k-1) \right], \ k = 2, 3, \dots, n. \tag{2}$$

4. Define the first-order differential equation for the sequence *X*(1) as

$$\frac{d\hat{X}^{(1)}}{dt} + a\,\hat{X}^{(1)} = b,\tag{3}$$

where *a* and *b* express the estimated parameters of the forecasting model.

5. Using the least squares estimate, it is possible to derive the estimated first-order AGO sequence

$$
\mathfrak{X}^{(1)}(k+1) = \left(\mathfrak{x}^{(0)}(1) - \frac{b}{a}\right) \mathfrak{e}^{-ak} + \frac{b}{a}.\tag{4}
$$

Parameters *a* and *b* can be calculated using the following (5)–(7).

$$
\begin{bmatrix} a \\ b \end{bmatrix} = \left(B^T B\right)^{-1} B^T Y\_{N\prime} \tag{5}
$$

$$B = \begin{bmatrix} -z^{(1)}(2) & 1 \\ -z^{(1)}(3) & 1 \\ \vdots & \vdots \\ -z^{(1)}(n) & 1 \end{bmatrix}' \tag{6}$$

$$\mathbf{Y}\_N = \left(\mathbf{x}^{(0)}(\mathbf{2}), \ \mathbf{x}^{(0)}(\mathbf{3}), \dots, \ \mathbf{x}^{(0)}(n)\right)^T. \tag{7}$$

6. The estimated members *x*ˆ<sup>0</sup> *<sup>k</sup>*+<sup>1</sup> can be obtained from the listed sequence by the inverse accumulated generating operation (IAGO)

$$
\mathfrak{X}^{(0)}(k+1) = \mathfrak{X}^{(1)}(k+1) - \mathfrak{X}^{(1)}(k). \tag{8}
$$

3.2.2. Nonlinear Grey Bernoulli model NGBM(1,1)

To adjust the traditional grey model to obtain the higher prediction precision, professor Chen [38] (Z Application of the novel nonlinear) firstly proposed the Nonlinear Grey Bernoulli Model (NGBM(1,1) model) which combines GM(1,1) with the Bernoulli differential equation. The difference of NGBM(1,1) lies in the greater curvature compared to the GM(1,1) solution.

The Bernoulli differential equation for the sequence *X*(1) is defined as follows:

$$\frac{d\hat{X}^{(1)}}{dt} + a\,\hat{X}^{(1)} = b\left[\hat{X}^{(1)}\right]^i, i \in \mathbb{R}.\tag{9}$$

If the exponent *i* was set to 0, the model would be equal to the GM(1,1) definition. Thus, the iterative approach with aspect to the minimal mean absolute percentage error (MAPE) should be applied to find the optimal exponent value *i*.

$$\mathfrak{X}^{(1)}(k+1) = \left[ \left( \left( \mathfrak{x}^{(0)}(1) \right)^{1-i} - \frac{b}{a} \right) e^{-a(1-i)k} + \frac{b}{a} \right]^{\frac{1}{1-i}}, \quad i \neq 1. \tag{10}$$

Parameters *a* and *b* can be calculated by following (5) with vector *YN* (7), but with different matrix *B*:

$$B = \begin{bmatrix} -z^{(1)}(2) & \begin{bmatrix} z^{(1)}(2) \end{bmatrix}^i \\ -z^{(1)}(3) & \begin{bmatrix} z^{(1)}(3) \end{bmatrix}^i \\ \vdots & \vdots \\ -z^{(1)}(n) & \begin{bmatrix} z^{(1)}(n) \end{bmatrix}^i \end{bmatrix}. \tag{11}$$

By performing IAGO, the predicted values *x*ˆ(0)(*k* + 1) can be calculated following (8).

3.2.3. Creating the Set of Grey Models

As mentioned above in the section concerning data pre-processing for GM models, the grey theory is used to find a trend in data. However, the result is a curve, which is almost straight in the case of GM(1,1). Therefore, it is not suitable for predicting periodically changing events, such as, daily energy consumption per year.

Electricity consumption is heavily affected by several factors, some of which were already mentioned. We observed that if annual electricity consumption is higher, it is usually dispersed within the year keeping the consumption pattern and the increment is dispersed proportionally. Concerning the maximal hourly consumption per day, the most affecting parameter is what kind of a day in the week it is and in what part of the year it is situated. Hence, the decision was made to create a set of transverse GM(1,1) models. Instead of predicting a standard time series, the first model of a set could forecast the consumption of the first Monday of a year based on data of the first Monday of previous years, etc. Therefore, the time series of data must be decomposed for this purpose and at the end results of the GM(1,1) set have to be recomposed to a regular time series. Provided that the annual consumption is rising, and the consumption pattern is preserved and not harmed by some unpredictable event, such as the pandemic lockdown, the forecast would disperse increased consumption among individual days effectively and vice versa.

Such a traverse set of grey models can be made for any day of the week or even for all days. However, due to the comparison with the official SEPS load consumption forecast, which publishes forecasts only for Wednesdays, we decided to construct the transverse grey model set for the same day of the week. Moreover, Wednesday is considered to be the day that best reflects the average consumption of a standard working day.

#### *3.3. Artificial Neural Network (ANN)*

An artificial neural network works similarly to a brain; it is a complex system of entities called neurons, with connections to other such entities. Each neuron somehow processes the incoming signal/s to an output signal/s. The interconnections, i.e., memory paths, consist of axons, dendrites, and synapses, and electrochemical media, the so-called neurotransmitters, are used for transmission. Some interconnections become stronger than others in the learning process. It is estimated that the human brain has around 100 billion interconnected neurons and each neuron may receive stimuli from as many as 10,000 other neurons. [39].

In the case of the ANN, artificial neurons act in the same way, but they are processing computer data, and their interconnections are called weights. Once the learning process and weight settings have been completed, ANN can be used to solve similar problems. Schematically, the whole layout of the neural network is usually divided into an input layer, a hidden layer/s, and a final output layer as is shown in Figure 4.

**Figure 4.** Shallow neural network.

3.3.1. Multi-Layer Feed-Forward Back-Propagation Network

A multi-layer feed-forward back-propagation network was designed via MATLAB's neural network fitting tool (nftool). Originally, nftool provided a shallow neural network, e.g., a network with just one hidden layer, that can be programmatically changed.

Nftool is the name of a tool that provides a graphic user interface to design, train, and validate a feedforward neural network to solve approximation problems. Default network settings of nftool are pre-set as follows:


Several changes have been made to the needs of this research and are as follows:

• The Quasi-Newton Backpropagation (BFGS) was used as a training method. The theory of quasi-Newton methods is based on the fact that an approximation to the curvature of nonlinear function can be computed without explicitly forming the Hessian Matrix. Newton's method is an alternative to the conjugate gradient methods for fast optimization. Backpropagation is used to calculate derivatives of performance with respect to the weight and bias variables. The variables are adjusted in the iteration process and a special parameter is estimated to minimize the performance along with the search direction *dX*. The line search function is used to locate the minimum point. The first search direction is the negative of the gradient of performance. In succeeding iterations, the search direction is computed according to the following formula:

$$dX = H^{-1} \lg X,\tag{12}$$

where *gX* is the gradient and *H* is an approximate Hessian matrix [41,42].


**Figure 5.** Validation process.

#### 3.3.2. Validation Process

The actual, used network adjusts its weights and biases matrixes when the training process is repeated without network reinitialization. While continuous re-education of the network may lead to a higher prediction accuracy, it can also lead to an overfitting problem. On the other hand, learning process abortion immediately after performance decrement may lead to premature finalization because of local minima.

Therefore, the validation process was set as follows. Each training process increases the counter by one. If the validation criterion value achieves a lower value in less than 6 consecutive runs, the counter is restarted and the training process continues with the current network setting; otherwise, the network is reinitialized to default weights and biases. The validation criterion value was set as the sum of equalized training and validation performance, i.e., the training SSE was multiplicated by 0.15 and the validation SSE was multiplied by 0.85. The maximum number of reinitializations was set to 10 resets apart from the reinitialization due to the inappropriate initialization. The inappropriate initialization may occur after network reset and the network may get stuck in the local minima. This behavior is associated with a large performance error. In such a case, the network was reinitialized without increment of the reset counter. In the end, the model with the lowest validation criterion value was chosen for the evaluation of model performance on the testing sample.

#### *3.4. Hybrid Model*

Examples of combining different models can be found in the literature, e.g., [24,26,43]. The main idea for the hybridization is to deduce if a hybrid model may lead to further improvement of prediction performance. Therefore, here the process is not standard, i.e., to find the appropriate weight distribution on the training sample and then to validate it on the testing sample, but iteration runs straight on the testing sample. The process serves as a starting point for future research. In this case, all proportion weight combinations of the predictions of ANN and GM(1,1) set were searched. The step for this iteration was set to 5%. The RMSE (12) and the MAPE (13) metrics were monitored throughout all the component combinations of the potential hybrid model.

#### *3.5. Performance Criteria*

In order to reflect the performance and the effectiveness of models, some indicators are adopted to evaluate the result. The definitions of the forest accuracy indicators are shown respectively, as follows:

(a) RMSE—Root Mean Squared Error

$$RMSE = \sqrt{\frac{1}{n} \sum\_{i=1}^{n} (\mathbf{x}\_i - \mathbf{x}\_i)^2}. \tag{13}$$

(b) MAPE—Mean Absolute Percentage Error

$$MAPE = \frac{1}{n} \sum\_{i=1}^{n} \left| \frac{\mathbf{x}\_i - \mathbf{\hat{x}}\_i}{\mathbf{x}\_i} \right| 100\text{\%}.\tag{14}$$

#### (c) RMSPE—Root Mean Squared Percentage Error

$$RMSPE = \sqrt{\frac{1}{n} \sum\_{i=1}^{n} \left(\frac{\mathbf{x}\_i - \mathbf{x}\_i}{\mathbf{x}\_i}\right)^2} 100\%. \tag{15}$$

In the formulations, the *xi* is the real value and *x*ˆ*<sup>i</sup>* is the predicted value, and *n* is the number of samples. Lower forecast indicators indicate more satisfactory predictive ability and higher accuracy.

#### **4. Results**

The results are divided into three parts, namely the prediction of grey models, the results of artificial neural networks, and finally the potential of hybridization. The results were maintained separately for the year 2019, which was not affected by pandemic lockdown, and the year 2020, which was affected in all provided results. However, in the part concerning the search of hybridization proportion, just the 2019 results were considered. All the prediction charts also contain the official SEPS prediction due to easier assessment of the prediction performance provided by the respective models. As a reminder, all provided models serve as two-year-ahead forecasting models and the official SEPS load consumption predictions are one-year-ahead predictions.

#### *4.1. Grey Models Prediction*

At first, the prediction of the model using the set of standard grey models is provided. Figure 6 shows good results and supports the used methodology to construct the GM(1,1) set for prediction of periodic trend. Even the two-year-ahead prediction (for 2020) shows comparable or better results with the official one-year-ahead forecast and supports the viability of the proposed methodology.

**Figure 6.** Prediction of Grey Model set. (**a**) Prediction of GM(1,1) set for 2019, (**b**) Prediction of GM(1,1) set for 2020.

Figure 7 shows the results for the NGBM(1,1) set. As can be observed in the charts, while one-year-ahead prediction achieved good results (but was worse than the GM(1,1) set), the two-year-ahead forecast falls considerably behind even the SEPS forecast. The main advantage of better curvature seems to be ineffective in such a transverse approach; thus, the NGBM(1,1) was not used for hybridization.

**Figure 7.** Prediction of Nonlinear Grey Bernoulli Model set. (**a**) Prediction of NGBM(1,1) set for 2019, (**b**) Prediction of NGBM (1,1) set for 2020.

*4.2. Neural Network Prediction*

4.2.1. Hidden Layers Composition

During our search for the optimal hidden layers layout, various combinations of the following numbers resulting from data composition were inspected:


Throughout the process, only the number of neurons and layers were changed, but not the layer types. Shallow NN with just one hidden layer composed of 3 or 5, ... or 53 and even 73 neurons were very fast, but the results were unsatisfying. The shallow NN with 9540 (53 × 12 × 5 × 3) neurons gave a satisfactory result, but the time consumption for training was enormous. Interim results showed that more layers with a lower number of neurons gave comparable or even better results than the one shallow NN with a huge hidden layer. The composition of 53, 12, 5, and 3 neurons in 4 hidden layers achieved a satisfactory result. When the number of neurons was increased by multiplication with various integer multiplicators, the results were of comparable quality, but the learning time was rising significantly. Therefore, the decision on the minimal satisfactory composition was made as follows.

The final layout of the hidden layers was divided into 4 hidden layers with 53, 12, 5, and 3 neurons in the listed order, as depicted in Figure 8.

**Figure 8.** Neural network layout.

#### 4.2.2. Validation Process

Data from the validation process can be seen in Figure 9. As it can be observed, all iterations brought relatively similar results in terms of performance measured via sum squared error. Probably, fewer resets were needed, but the figure shows another way of future research in creating a set of shallow NN components via bagging, boosting, or stacking to further improve the prediction power of ANN.

**Figure 9.** Validation process log.

Predictions achieved by the ANN are depicted in Figure 10. It is obvious, especially in one-year-ahead predictions, that the results are the best among provided models and the weight of this component would play the main role in the potential hybrid model.

**Figure 10.** Prediction of ANN. (**a**) Prediction of ANN for 2019, (**b**) Prediction of ANN for 2020.

#### *4.3. Hybridization*

A comparison of the prediction performance metrics of the potential hybrid model is provided in this part of the results. The objective here is not to find the best score weight distribution of individual elements in the optimization or validation process, but to find out if such a process may lead to improvement. Deducing what period to take into consideration in the optimization process is a theme for future research. However, the weight distribution can be optimized via nonlinear programming with the objective function, Equation (16):

$$\text{minMAP} = \frac{1}{n} \sum\_{i=1}^{n} \left| \frac{\mathbf{x}\_{i} - (w\_{A}\hat{\mathbf{x}}\_{iA} + w\_{G}\hat{\mathbf{x}}\_{iG})}{\mathbf{x}\_{i}} \right| \cdot 100\% \tag{16}$$

and the constrains of Equations (17) and (18):

$$w\_A + w\_G = 1\tag{17}$$

$$w\_{A\prime}w\_{G} \geq 0 \tag{18}$$

where *wA*, *wG* are wanted weights of particular models and *x*ˆ*iA*, *x*ˆ*iG* are estimated values by these models.

Nevertheless, Figure 11 shows a comparison of RMSE and MAPE metrics on different weight distributions based solely on the testing sample, i.e., predictions for 2019. The optimal point, in this case, is situated between near 85% weight for the ANN and 15% for the GM(1,1). We assume that the closer the prediction power is, the better the results could be achieved by hybridization. Figure 11 shows that the use of two components bring small but measurable improvements. We presume that the inclusion of another similarly performing method could potentially further improve the results.

#### **5. Discussion**

There are many works incorporating grey models to forecast annual energy consumption, e.g., [5,7,9–13], but we have not found works concerning daily based forecasts via grey models. As far as we know, more sophisticated methods, such as random forests, NN, or SVM are used to forecast energy consumption, similar to [44]. Our experimentation is focused on the prediction of maximum hourly electricity consumption per day in Slovakia by two Grey model types (GM(1,1), NGBM(1,1)) as a transverse set and one ANN. Moreover, the fourth potential hybrid model, as an example of the future path of our research, was presented in the manuscript.

A traverse set of grey models or ANN can be constructed for any day of the week or for all days. However, due to the comparison with the official SEPS load consumption forecast, which publishes forecasts only for Wednesdays, the models and extracted data are focused on the same day. Wednesday is considered to be the day with the average working day consumption. In our study, all three mentioned models used the same sample of acquired data (maximum hourly electricity consumption per each Wednesday from 2010 to 2018) for training. Equally, the same sample of acquired data was used for testing, i.e., for comparison of the real data with predicted ones (maximum hourly electricity consumption per each Wednesday in years 2019 and 2020). Though, the whole training dataset cannot be denoted as equal for grey models and ANN due to the different data pre-processing as described in Section 3.1. However, these differences do not prevent relevant comparison of model performance.

Although we have no information about the indicators or method used in the official prediction (marked as SEPS), the predicted data are also involved in our comparison for the relevant purpose of our findings.

The prediction performance of our results altogether with official prediction is shown in Table 1.

As can be seen from Table 1, all our models performed better than SEPS in 2019. From the grey models, the GM(1,1) set is more accurate than the NGBM(1,1) set. The ANN model outperformed the official SEPS predictions as well as the grey model forecasts in both 2019 and 2020. This is probably caused by additional information added in the data pre-processing phase. Namely, it is information of whether the public holiday falls on a specific Wednesday or not. The importance of public holiday information can be observed in Figure 10, where the steep decrease can be monitored in the ANN prediction. There is a public holiday in Slovakia on the 1st and 8th of May (18th and 19th Wednesday in 2019) and maximal electricity consumption was reasonably lower as surrounding points in the chart. The only model that reflects these circumstances was the ANN. On the other hand, the 53rd Wednesday in 2020 (30 December 2020) is not a public holiday and the network predicted a higher value than it should.


**Table 1.** Prediction performance comparison.

As mentioned before, there is various research that uses various types of grey models to forecast whole year consumption, but we are not aware of any using them as a transverse set for daily forecasting. Nevertheless, our results are of comparable quality in the terms of MAPE metrics despite the different usage. For example, the results of the improved grey models achieved 2.78% to 3.10% in consumption of the APEC countries in [5]. The results of backpropagation neural network in the mentioned research achieved 5.10% and 9.97% for support vector machine regression. Another research of this type was used to forecast energy consumption in China and India and compares 5 types of the grey models with various results ranging from 3.98% to 14.79% for Beijing consumption and from 3.46% to 35.02% for India in [9].

In the case of daily based forecasting of energy consumption in [44], more demanding methods compared to grey models were used. The results of the case study for two companies achieved from 8.75% to 19.23% for one company and from 4.45% to 6.07% for the second case. The Functional Principal Component Analysis was used to decompose the electric consumption patterns in [45]. The method was used to investigate and predict consumption patterns in the Milan metropolitan area on special contractual characteristic groups with MAPE ranging from 7.63% to 39.19%. Forecasting the daily night peak electric power demand of the Sri Lankan power system by using past daily data in time series analysis was used in [46]. The prediction model based on ARIMA achieved a monthahead forecast with MAPE of 4.195% and a week-ahead MAPE of 1.855%. Modelling and forecasting hourly electricity demand in West African countries by using multiple regression analysis was introduced in [47] with an average relative error of less than 15%, except for Sierra Leone, which had a relative error of 38.5% due to civil war in the country. Research with a very different approach to forecasting daily electricity consumption can be found in [48]. The meteorological factors played a major role in the one-week-ahead prediction, and the relative errors of their models ranged from 2.36% to 10.40%.

All predictions for the year 2020 show worse results in comparison with the year 2019. This is because of the pandemic lockdown in spring 2020. However, the prediction of the NGBM(1,1) set starts to fall behind even the SEPS prediction and thus it seems to not be the right component in the hybridization, or at least with the used dataset. All possible combinations of the component weights with the step of 5% were checked to find out the potential of hybridization. This approach shows the potential to further improve the prediction power, where one component may improve the weaknesses of the other one. Finding out the optimum in an appropriate way and searching for suitable components are the objective for future research. However, Figure 12 is provided to show the comparison of predictions in the test period from January 2019 to December 2020, i.e., evaluation of the prediction power for the hybrid model with proportion 85/15 (ANN to GM(1,1)).

**Figure 12.** Comparison of predictions for the years 2019 and 2020.

We identify several limitations of our study. The difficulty was in acquiring the data of the official system load prediction provided by SEPS [35]. Electric energy consumption is influenced by many factors, as the impact of weather fluctuations and economic decisions (lockdown during COVID-19) are hard to consider in prediction. One way to overcome this obstacle is to focus on interval prediction instead of point prediction. Another possibility is to improve the prediction accuracy by looking for another component in a hybrid model, finding the optimal weight proportion, or stacking or bagging shallow neural networks.

#### **6. Conclusions**

Electrical energy consumption forecasting is important for planning and facility expansion in the electric industry. Accurate forecasts can save operating and maintenance costs, increase the reliability of power supply and delivery systems, and correct decisions for future development. Forecasting is also important for the sustainable development of the electric power industry. Much presented research and also our results show that is very problematic to predict an exact electricity consumption pattern. The reasons lie in the substantial impact of various factors, such as weather conditions, economic situation, population growth, pandemic outbreak, etc.

The main goal of the article was to offer more accurate models to predict electrical energy consumption in Slovakia than officially provided. The contributions of the proposed article can be summarized as follows:


**Author Contributions:** Conceptualization: M.V., M.P. and O.B.; methodology: M.V. and M.P.; software: M.P.; validation: O.B.; formal analysis: M.P. and O.B.; investigation: O.B.; resources: M.V. and M.P.; data curation: O.B. and M.P.; writing—original draft preparation: M.P. and M.V.; writing—review and editing: M.V. and O.B.; visualization: M.P. and O.B.; supervision: M.V. and M.P.; project administration: M.V. and. M.P. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Data Availability Statement:** Publicly available datasets were analyzed in this study. This data can be found here: [1–4].

**Acknowledgments:** This research has been supported by Slovak Grant Agency for Science (VEGA), Grant No. 1/0157/21.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**

