**1. Introduction**

South Africa has been regarded as a late participant in the three key industrial revolutions [1]. The use of artificial intelligence (AI) and data is on the rise in South Africa [2–4], which suggests that South Africa might not be a late participant in the fourth industrial revolution. In 2007, 2013, 2018, and 2019, South Africa experienced power supply shortages due to various challenges, leading to load shedding [1]. South Africa's public power utility, Eskom, has on several occasions cited its inability to accurately forecast the unplanned capability loss factor (UCLF) as one of the major factors leading to an unreliable power supply and unpredictable load shedding [5,6]. The UCLF is a measure of unplanned plant breakdowns. The behavior of the South African UCLF has not been well studied. Pretorius et al. studied the impact of the South African energy crisis on emissions [7]. That study only mentions an increase in UCLF due to maintenance deferral; it does not address how to forecast UCLF, nor the major contributing factors that could help in forecasting it. The UCLF, planned capability loss factor (PCLF), and other capability loss factor (OCLF), together with the installed capacity, determine the power available to supply customers.

**Citation:** Motepe, S.; Hasan, A.N.; Shongwe, T. Forecasting the Total South African Unplanned Capability Loss Factor Using an Ensemble of Deep Learning Techniques. *Energies* **2022**, *15*, 2546. https://doi.org/10.3390/en15072546

Academic Editors: Luis Hernández-Callejo, Sergio Nesmachnow and Sara Gallardo Saavedra

Received: 31 January 2022 Accepted: 7 March 2022 Published: 31 March 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

The PCLF accounts for planned plant outages for maintenance or refurbishment of the plant. It is typically a planned value set by the utility, which can change its planned outages (PCLF) depending on different factors. The OCLF accounts for other or random losses and is usually significantly smaller than the UCLF [8]. The installed capacity gives the number of megawatts of the installed power plant units. Micali studied the prediction of new coal power plants' availability in the absence of data in South Africa [8]. The author mentions that the work is a precursor to predicting UCLF in new plants and proposes using expert opinion with some data from stations where data are available. However, the work in [8] did not focus on the total UCLF, assumed limited availability of data, did not use AI techniques, and depended on expert knowledge. In [9], the authors state that expert knowledge can vary from one expert to the next, and thus different experts can draw different results from the same data. In addition, the author of [8] did not investigate factors that affect power supply and may influence the UCLF. There is, thus, a gap in South Africa in terms of accurately forecasting UCLF. In addition, the behavior of the total South African UCLF remains unstudied, as only precursor work exists and that work focuses on new plants. Another gap is the use of intelligent systems that do not rely on human experts in UCLF forecasting.
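As a rough illustration of how the three capability loss factors and the installed capacity together determine the power available to customers, the sketch below assumes that each loss factor is expressed as a percentage of installed capacity and that the factors are additive. This additive relationship, the function name `available_capacity_mw`, and the example numbers are assumptions made for illustration; they are not taken from this paper or from utility data.

```python
def available_capacity_mw(installed_mw, pclf_pct, uclf_pct, oclf_pct):
    """Estimate available capacity in MW.

    Assumes (for illustration only) that PCLF, UCLF, and OCLF are additive
    percentages of the installed capacity.
    """
    total_loss_pct = pclf_pct + uclf_pct + oclf_pct
    if not 0.0 <= total_loss_pct <= 100.0:
        raise ValueError("combined loss factors must lie between 0 and 100 percent")
    return installed_mw * (1.0 - total_loss_pct / 100.0)


# Hypothetical numbers, not real utility data.
example_mw = available_capacity_mw(
    installed_mw=40_000, pclf_pct=10.0, uclf_pct=15.0, oclf_pct=1.0
)
print(f"Available capacity: {example_mw:.0f} MW")
```

Under this assumption, an error in the UCLF forecast translates directly into an error in the expected available capacity, which is why the UCLF is the quantity of interest for load-shedding planning.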

Furthermore, knowing when the power system might experience a power shortage remains a topic of interest and is important not only for the utility but also for customers. Knowing when there may be a power shortage, and hence a requirement to reduce consumption, helps customers plan their operations. Unplanned failures have been studied before. In [10], historical outage records were used to train neural networks for real-time prediction of distribution system outage duration. The Netherlands collects information on unplanned outages from its utilities to inform its maintenance and investment policies [11].

South Africa is the largest producer of electricity in Africa and is among the top 25 power producers in the world [12,13]. Over 80% of South Africa's power is produced by coal-fired power stations and a nuclear power station. The total South African power grid UCLF can, thus, be modeled as that of the coal and nuclear power stations. Despite the recent move towards cleaner energy, the largest power-producing countries, such as India and China, still rely heavily on coal-fired power stations [12]. The study of coal thermal power plants and their behavior is, thus, still of interest [14–17]. The study of the South African coal-fired power station UCLF is, therefore, important, as coal power plants are still widely used and remain an active research topic.

Forecasting and prediction have been topics of interest for many researchers [10,18]. This is mainly due to an interest in understanding and predicting the future behavior of certain variables. Artificial intelligence (AI) techniques have become popular in these forecasting and prediction tasks. One of the reasons for this popularity is their ability to model non-linearity with high accuracy. Khoza and Marwala used an ensemble of the multi-layer perceptron and rough set theory to predict the direction that the South African gross domestic product (GDP) would take [18]. Galius proposed a probabilistic model for modeling power distribution network blackouts [19]. In Egypt, power cable failures were analyzed to help prevent future power outages [20]. In [21], bilateral long short-term memory (LSTM) was used to forecast the short-term cycle time of wafer lots for the planning and control of wafer manufacturing. The rise of computational power and access to labeled data has led to an increase in the utilization of deep learning techniques [22]. Deep learning techniques have shown excellent performance in multiple areas, such as language and speech processing, as well as computer vision [23,24]. Alhussein et al. used a hybrid of convolutional neural networks (CNN) and LSTM to forecast individual house loads [25]. Here, the researchers used CNN to select features from the input data and LSTM to learn the sequence. The authors reported a mean absolute percentage error (MAPE) improvement greater than 4% in comparison to LSTM-based models. Kong et al. also combined CNN and LSTM for short-term load forecasting in Singapore [26]. Pandit et al. compared LSTM and Markov chain models in weather forecasting for German offshore wind farms to improve their wind turbine availability and maintenance [27]. Deep learning has also been used to forecast wind speeds at turbine locations [28]. The authors combined CNN and the gated recurrent unit (GRU) to achieve satisfactory results in comparison to existing models. Deep learning techniques have also been used to forecast the Korean postal delivery service demand [29]. This observed performance of deep learning techniques has also led to their adoption in recent load forecasting studies [30,31]. A gap still exists in applying these state-of-the-art techniques, already used for forecasting in different engineering areas, to forecasting UCLF (and the South African UCLF in particular).
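The MAPE metric mentioned above is commonly defined as the mean of the absolute forecast errors, each divided by the corresponding actual value and expressed as a percentage. The sketch below follows that common definition; the function name `mape` and the sample values are illustrative assumptions and are not taken from any of the cited studies.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent.

    Uses the common definition: 100 * mean(|actual - forecast| / |actual|).
    Raises ValueError for empty or mismatched inputs, or zero actual values.
    """
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Hypothetical values for illustration only.
actual_values = [100.0, 200.0, 400.0]
forecast_values = [110.0, 190.0, 400.0]
print(f"MAPE: {mape(actual_values, forecast_values):.2f}%")
```

Because each error is scaled by the actual value, MAPE allows forecasts of quantities with different magnitudes to be compared, but it cannot be computed when an actual value is zero.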

As observed, a number of studies have used a combination of techniques to achieve improved performance [25–29]. Such combinations are usually termed ensemble or hybrid techniques. Ensemble techniques have also been used for classification in different engineering applications. Ramotsoela et al. used an ensemble of five artificial intelligence techniques to detect intrusion in water distribution systems [32]. The ensemble model combined an artificial neural network (ANN), recurrent neural network (RNN), LSTM, GRU, and CNN in a voting system. The ensemble model classified its output as an anomaly if at least two constituent models classified their outputs as an anomaly. CNN models have been combined to determine driver behavior from multiple data streams [33]. The proposed ensemble model incorporated a voting system to enhance the classification accuracy. A double ensemble model of semi-supervised gated stacked auto-encoders has been used to predict industrial key performance indicators [34]. Drif et al. proposed an ensemble of auto-encoders for recommendations [35]. The authors used an aggregation method to combine outputs from the sub-models to form the ensemble model output. Bibi et al. used an ensemble-based technique to forecast electricity spot prices in the Italian electricity market [36]. The authors estimated deterministic components using semi-parametric techniques and then determined stochastic components using time series and machine learning algorithms. The final forecast is obtained from the estimates of both components [36]. Shah et al. used a similar approach to Bibi et al. in short-term electricity demand forecasting for the Nordic electricity market [37]. The similarity is that the authors separated their approach into a deterministic and a stochastic component and then combined the estimates from both to obtain the final forecast. None of the reviewed literature covers the use of ensemble techniques in forecasting UCLF. The use of ensemble techniques in UCLF forecasting is, thus, an existing research gap.
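The threshold voting rule described above for the intrusion detection ensemble in [32], where the ensemble flags an anomaly when at least two constituent models do, can be sketched as follows. This is a minimal illustration of the voting idea only, not the implementation from [32]; the function name `ensemble_flags_anomaly`, the `min_votes` parameter, and the sample votes are hypothetical.

```python
def ensemble_flags_anomaly(model_flags, min_votes=2):
    """Return True when at least `min_votes` constituent models flag an anomaly.

    `model_flags` holds one boolean per constituent model (for example,
    one each for an ANN, RNN, LSTM, GRU, and CNN).
    """
    votes = sum(1 for flag in model_flags if flag)
    return votes >= min_votes


# Hypothetical votes from five constituent models.
print(ensemble_flags_anomaly([True, False, False, True, False]))   # two votes
print(ensemble_flags_anomaly([False, False, True, False, False]))  # one vote
```

A low vote threshold makes the ensemble more sensitive than a simple majority vote, at the cost of more false alarms. For a forecasting task such as UCLF, the analogous step would combine numeric forecasts instead of class votes.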

This paper makes the following contributions: (i) a novel study of the South African UCLF behavior using state-of-the-art AI (deep learning and ensemble) techniques; (ii) an investigation of the impact of the installed capacity, historic demand, and PCLF on the UCLF forecasting accuracy; and (iii) the introduction of a novel deep-learning ensemble system for forecasting the total South African UCLF.

The remainder of this paper is arranged as follows: Section 2 presents the techniques used in this research. Section 3 presents the experimental setup. The proposed UCLF forecasting system is presented in Section 4. Section 5 then presents the experimental results and the discussion of the results. The paper conclusions are presented in Section 6. Section 7 presents the limitations of the study as well as future work. The paper flow chart is shown in Figure 1.

**Figure 1.** The paper arrangement flow chart.

#### **2. Methods Used**

This section presents the four techniques used in this research.
