Load Forecasting for the Laser Metal Processing Industry Using VMD and Hybrid Deep Learning Models

Aksan, Fachrizal; Suresh, Vishnu; Janik, Przemysław; Sikorski, Tomasz

doi:10.3390/en16145381

Faculty of Electrical Engineering, Wroclaw University of Science and Technology, 50-370 Wroclaw, Poland

* Author to whom correspondence should be addressed.

Energies 2023, 16(14), 5381; https://doi.org/10.3390/en16145381

Submission received: 17 May 2023 / Revised: 10 July 2023 / Accepted: 12 July 2023 / Published: 14 July 2023

(This article belongs to the Special Issue Digital Twins in Power Electronics)

Abstract

Electric load forecasting is crucial for the metallurgy industry because it enables effective resource allocation, production scheduling, and optimized energy management. Accurate load forecasting requires an efficient approach. In this study, we considered the time factor of univariate time-series data to implement various deep learning models for predicting the load one hour ahead under different conditions (seasonal and daily variations). The goal was to identify the most suitable model for each specific condition. Two hybrid deep learning models were proposed. The first model combines variational mode decomposition (VMD) with a convolutional neural network (CNN) and gated recurrent unit (GRU). The second model combines VMD with a CNN and long short-term memory (LSTM). The proposed models outperformed the baseline models. The VMD–CNN–LSTM performed well for seasonal conditions, with an average RMSE of 12.215 kW, MAE of 9.543 kW, and MAPE of 0.095%. Meanwhile, the VMD–CNN–GRU performed well for daily variations, with an average RMSE of 11.595 kW, MAE of 9.092 kW, and MAPE of 0.079%. The findings support the practical application of the proposed models for electrical load forecasting in diverse scenarios, especially concerning seasonal and daily variations.

Keywords:

deep learning models; short-term electric load forecasting; time factor; variational mode decomposition

1. Introduction

1.1. Background

Population growth and economic development have resulted in a significant increase in electricity consumption in both the household and commercial sectors. For instance, in Poland, end-user energy consumption steadily increased by an average of 1% annually between 2015 and 2019, leading to an overall increase in electricity supplied to end users of approximately 6.8%, or 8.64 TWh [1]. Thus, electric utilities must maintain power grid stability by providing an adequate source of electricity to meet load demand, which fluctuates over time. This requires electric utilities to estimate the current and future power demand to achieve a balance between supply and demand, so that the generated power meets the power system reliability criteria. Load forecasting, which involves estimating current and future load demand, is essential for power system operation and planning, enabling authorities to make crucial decisions regarding load changes, power generation planning, power transaction planning, and infrastructure development [2].

The energy intensity of companies in Poland, in both the industrial and service sectors, is comparatively high within the European Union [3]. Increasing electricity prices in the commercial sector pose a significant challenge. Consequently, companies are compelled to improve energy efficiency and explore alternative solutions to manage energy demand and prices. Despite the potential economic benefits and reduction in carbon emissions, only a few end-users in the industrial and service sectors have adopted renewable energy sources, battery storage, and smart technologies for buildings and power grids to address these challenges. However, incorporating these new applications can lead to fluctuations and randomness in load demand owing to the intermittent nature of renewable energy sources, potentially causing significant changes in the electricity source preferences of end-users and ultimately impacting the balance between energy supply and demand. Therefore, reliable and accurate load forecasting is essential for addressing energy balance issues. Another challenge of load forecasting is related to the privatization of the electricity market in some countries, where energy consumers are free to choose their suppliers among several operators [4]. This leads to different electricity prices and to load curves that are poorly correlated with other variables, which makes accurate load forecasting all the more important.

Electric load forecasting is a commonly used technique for predicting future electric loads based on historical loads and other relevant information. This process involves forecasting for a specific period, known as the forecast horizon, which determines how far in advance the forecast is made. The forecast horizon can be classified into four categories based on the time step and typical application [5,6]: very short-term, short-term, medium-term, and long-term. The prediction of the electric load for periods ranging from five minutes to one hour is considered a very short-term prediction, which is primarily used for real-time monitoring. An accurate load prediction from one to several hours in the future is classified as a short-term horizon and is essential for load balancing and power exchange [4]. Load prediction for larger time horizons, such as one week or several weeks, is categorized as a medium-term or long-term horizon, which is often utilized for maintenance planning and operation management.

1.2. Related Work and Contribution

Several approaches are available for load forecasting, and these methods differ depending on whether they are used for short-, medium-, or long-term forecasts. The primary contrast between medium- and long-term forecasting and short-term forecasting is that the former involves external factors such as population growth projections, economic changes, and technological developments. End-use and econometric modelling are two well-known techniques employed for medium- to long-term predictions [2]. In contrast, the methods developed for short-term load forecasting include conventional statistical methods [1], such as the autoregressive integrated moving average (ARIMA) [7], and artificial intelligence methods [8,9], such as machine learning [10] and deep learning (DL) [11,12].

A study in [13] compared the performance of short-term forecasting models based on statistical and deep learning methods. Researchers have determined that both methods have distinct advantages and drawbacks. The advantages associated with deep learning methods include a lower error rate, greater flexibility in addressing more intricate problems, and the ability to learn from data using computing power [14]. DL is a highly versatile approach that can be implemented effectively across a range of prediction tasks. However, the challenge in using the DL method is to determine the optimal configuration for the structure of the models. In contrast, traditional statistical methods can provide accurate prediction results and establish relationships between variables. However, this method primarily relies on traditional data analysis and may present challenges in its application to large and nonlinear datasets.

Deep learning is a well-known method for load forecasting. Notable examples of deep learning models employed for load forecasting include the multilayer perceptron (MLP) [15,16], recurrent neural networks (RNN) [17,18], long short-term memory (LSTM) [19,20], the gated recurrent unit (GRU) [21,22], the convolutional neural network (CNN) [23,24], and hybrid deep learning models [25,26].

According to [2], the short-term load forecast can be affected by several key factors, such as time, meteorological conditions, and potential end-use categories. The time factor is important to distinguish the pattern of electricity consumption under different conditions. For instance, the consumption of electricity during the summer and winter exhibited discernible discrepancies. Furthermore, there are significant disparities in the electricity consumption patterns on weekdays and weekends. Therefore, this time factor is very informative for load forecasting analysis by utilizing the time index of the data, such as the day of the week, hour of the day, and minute of the hour. Moreover, weather conditions also play an important role in influencing load demand characteristics. Among the various parameters, temperature and humidity have the most significant influence on the prognosis of the load demand. The last crucial key factor is customer class. Electricity consumers are typically categorized into three groups: Residential, Commercial, and Industrial. While electricity consumption patterns vary considerably among different classes, they exhibit a degree of similarity within each category [2]. Therefore, most utilities differentiate load patterns by class. These key factors play a critical role in determining the future load demand characteristics. Therefore, it is essential to consider these factors for accurate load forecasting. Nonetheless, it is worth noting that the development of load forecasting models is not necessarily restricted to incorporating all these key factors.

The present investigation focuses exclusively on a univariate time-series dataset that comprises historical values of electricity consumption. Notably, the relevant literature provides examples of studies that solely employ historical power demand or electricity load datasets for the development of load forecasting models. For instance, the authors of [15] exclusively utilized historical electric load datasets to devise a long-term hybrid model based on MLP and statistical techniques. Similarly, the study in [18] employed only historical load usage data to develop short-term load forecasting using an RNN. Reference [7] presented a study on the development of short-term load forecasting using the ARIMA and ANN approaches, in which real electricity load profiles were used as historical datasets. To achieve accurate load demand forecasting, it is imperative to consider the specific crucial factors. Consequently, this study places a primary emphasis on the time component as the principal element for load forecasting. By extracting the time index from the dataset, comprehensive information concerning the minute, hour, day, week, month, and year associated with each individual data point can be obtained. This time index facilitates convenient categorization of data points based on comparable temporal conditions.

Based on this insightful perspective, it is evident that DL methods hold substantial value in effectively addressing load forecasting challenges by utilizing historical electric load datasets. However, the current study identifies a significant research gap concerning the limited exploration of DL models, specifically for hour-ahead electricity load forecasting in the context of the metallurgy industry. Although previous studies have established a foundation for the effectiveness of DL models in electric load forecasting, there remains a scarcity of comparative studies that specifically focus on developing load forecasting models tailored to the unique requirements of the metallurgy industry. Furthermore, the influence of diverse conditions on the performance of deep learning models in this industry has not been adequately investigated. Consequently, the present study aims to investigate the hypothesis that there exists a significant difference in the accuracy of hour-ahead electric load forecasting among various deep learning models when subjected to different conditions within the metallurgy industry.

The main contribution of this study is the implementation of several deep learning models to predict the one-hour-ahead electricity load under different conditions, while also identifying the appropriate individual DL model for each specific condition. These conditions are divided into two broad categories: seasonal and day category. Seasonal conditions encompass winter, spring, summer, and autumn. The day category was subdivided into weekdays (Monday, Tuesday, Wednesday, Thursday, and Friday) and weekends (Saturday and Sunday). The proposed DL models for load forecasting evaluated in this study are hybrid models that combine variational mode decomposition (VMD) with CNN–LSTM and with CNN–GRU (VMD–CNN–LSTM and VMD–CNN–GRU). These models were compared with the baseline models, including MLP, LSTM, GRU, CNN, hybrid CNN–LSTM, and hybrid CNN–GRU. The main contributions of this study are summarized as follows.

The inclusion of the time factor as a crucial component that influences load forecasting, which allows for the differentiation of each data point and the formation of numerous sub-datasets based on various conditions.
The implementation of several deep learning models for short-term load forecasting for one hour ahead, considering different conditions.
The comparison and assessment of the performance of the deep learning models for load forecasting across diverse sub-datasets.

The paper is structured as follows: Section 2 provides a brief overview of deep learning models, Section 3 outlines the proposed methodology, Section 4 presents the dataset description, Section 5 presents the results and discussion, and finally, Section 6 provides the paper’s conclusion.

2. Deep Learning Description

2.1. Multilayer Perceptron (MLP)

The MLP, a feedforward neural network, generates outputs from inputs by learning the complex linear and non-linear relationships in data patterns [27]. The MLP is a deep learning model characterized by a hierarchical structure comprising a minimum of three node layers, namely the input layer, hidden layers, and output layer [28,29]. These layers are structured and interconnected, as shown in Figure 1. Each layer in the MLP model has at least one node. Each node is a neuron (perceptron) that applies a nonlinear activation function, except for the nodes in the input layer [28]. A node computes the weighted sum of the input values (x) from the previous layer, using the weights (w), and adds the bias value (b). The transfer function (f) then maps this sum to the output (y). The calculation is given in Equation (1):

y = f (\sum_{j = 1}^{n} w_{j} x_{j} + b) = f (w^{T} x + b)

(1)
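Equation (1) can be illustrated for a single node with a few lines of code. This is a minimal sketch rather than the configuration used in the study: the function names, the ReLU activation, and the numeric weights and inputs are hypothetical values chosen only to make the arithmetic easy to follow.

```python
import math


def relu(value):
    """Rectified linear unit, used here as an example transfer function f."""
    return max(0.0, value)


def neuron_output(weights, inputs, bias, activation=math.tanh):
    """Compute y = f(sum_j w_j * x_j + b) for one node, as in Equation (1)."""
    if len(weights) != len(inputs):
        raise ValueError("weights and inputs must have the same length")
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return activation(weighted_sum + bias)


# Hypothetical weights, inputs, and bias (not taken from the paper).
y = neuron_output([0.5, -0.25, 0.1], [2.0, 4.0, 10.0], bias=0.5, activation=relu)
print(y)  # weighted sum 1.0 - 1.0 + 1.0 = 1.0, plus bias 0.5, gives 1.5
```

A full MLP layer repeats this computation for every node, each with its own weight vector and bias.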

In the learning phase, the MLP model employs the backpropagation technique to adjust the connection weight values [30]. This adjustment is based on the error, which is computed by comparing the predicted output value with the actual output value for each training sample. This procedure is performed iteratively until the error of the MLP model is minimized, thereby enabling the model to predict with greater precision and reliability.

The MLP model is also a well-known model for load forecasting, as discussed in reference [15]. The approach in [15] combines statistical techniques with machine learning methods to address the potential issue of average convergence when solely relying on machine learning for mid/long-term load forecasting. Similarly, another study discussed in reference [16] demonstrates the proficiency of the MLP model in accurately predicting electrical power demand.

2.2. Recurrent Neural Network (RNN)

The recurrent neural network (RNN) represents a distinct variation of the deep learning (DL) model, which is considered unique when compared to other variants [31]. The reason for its uniqueness lies in the RNN layer’s special ability to model short-term dependencies with a hidden state [32]. This hidden state serves as a storage unit, acting as a highway that passes information from one time step to another within the unrolled RNN units [31]. The structure of these unrolled RNN units is depicted in Figure 2. This model also has capabilities for load forecasting applications. In reference [17], the authors propose the Online Adaptive RNN as a load forecasting approach that demonstrates the ability to continually learn from new data and adapt to evolving patterns. Additionally, reference [18] focuses on the application of RNN models in predicting electrical load to maintain a balance between demand and supply.

In the unrolled RNN unit, a hidden state at the current time step (h^(t)) is determined by the value of the previous hidden state (h^(t−1)) and the current input (x^(t)). This process provides a memory function to retain the information of the previous time step while processing the information of the current time step. Therefore, the output at the current time step (o^(t)) of the RNN is constantly dependent on the previous elements in the sequence. All connections between the input, the hidden state, and the output of the unrolled RNN unit have weights (w) and biases (b) at all time steps. The formula for calculating the hidden state value (refer to Equation (2)) and the output value (refer to Equation (3)) at current time step is as follows:

h^{(t)} = f (h^{(t - 1)} \cdot h_{w} + x^{(t)} \cdot x_{w} + h_{b})

(2)

o^{(t)} = f (h^{(t)} \cdot o_{w} + o_{b})

(3)
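The recurrence in Equations (2) and (3) can be traced with a scalar example. The sketch below assumes a hidden state of size one, a hyperbolic tangent as the function f, and hypothetical parameter values; the names `rnn_step` and `run_rnn` are introduced here for illustration only.

```python
import math


def rnn_step(h_prev, x_t, h_w, x_w, h_b, o_w, o_b, f=math.tanh):
    """One unrolled RNN step with scalar state (hidden size 1)."""
    h_t = f(h_prev * h_w + x_t * x_w + h_b)  # Equation (2)
    o_t = f(h_t * o_w + o_b)                 # Equation (3)
    return h_t, o_t


def run_rnn(sequence, h0=0.0, **params):
    """Apply rnn_step over a sequence, carrying the hidden state forward."""
    h = h0
    outputs = []
    for x_t in sequence:
        h, o_t = rnn_step(h, x_t, **params)
        outputs.append(o_t)
    return h, outputs


# Hypothetical parameters and a short normalized load sequence.
params = dict(h_w=0.5, x_w=1.0, h_b=0.0, o_w=1.0, o_b=0.0)
final_h, outputs = run_rnn([0.1, 0.2, 0.3], **params)
print(final_h, outputs)
```

Because the same weights are reused at every time step, each output depends on all earlier inputs through the hidden state.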

The RNN is a simple yet powerful model. However, it suffers from exploding and vanishing gradients when backpropagation through time is used. Standard RNNs have difficulty capturing long-term dependencies because the multiplied gradients can shrink or grow exponentially with the number of time steps. To address these issues, a different family of RNNs can be used, such as long short-term memory (LSTM) and the gated recurrent unit (GRU). These models extend regular RNNs so that they can handle long-term dependencies and store information for longer periods while mitigating exploding or vanishing gradients [33].

In reference [34], the LSTM model was introduced to solve the vanishing gradient problem by including memory cells and gates that regulate the flow of information through the network. A typical LSTM consists of memory blocks called cells, which have two states: the cell state and the hidden state. The cells decide whether to store or ignore information through three primary gates: a forget gate, an input gate, and an output gate, as shown in Figure 3a. The LSTM network operates in three steps. In the first step, the network uses the forget gate to determine what information in the cell state to discard or keep by applying the sigmoid function (S) to the input at the current time step (x_t) and the previous value of the hidden state (h_(t−1)). The computation in the forget gate is presented in Equation (4):

f_{t} = S (w_{f} \cdot [h_{(t - 1)}, x_{t}] + b_{f})

(4)

In the second step, the network updates the old cell state (C_(t−1)) into a new cell state (C_t) by selecting which new information to include in long-term memory (cell state). This process requires the values of the forget gate, the input gate (see Equation (5)), and the cell update gate, which produces the candidate cell state (see Equation (6)), where T denotes the hyperbolic tangent function. The new cell state is then computed as in Equation (7):

i_{t} = S (w_{i} \cdot [h_{(t - 1)}, x_{t}] + b_{i})

(5)

{C^{'}}_{t} = T (w_{c} \cdot [h_{(t - 1)}, x_{t}] + b_{c})

(6)

C_{t} = (C_{(t - 1)} \cdot f_{t}) + (i_{t} \cdot {C^{'}}_{t})

(7)

When the update of the new cell state (C_t) is complete, the next step is to compute the new hidden state (h_(t)). This state acts as the memory of the network that contains information about previous data. Additionally, it can be used as an output for prediction. In this step, the new cell state and the output gate are used. The formulas are given in Equations (8) and (9):

o_{t} = S (w_{o} \cdot [h_{(t - 1)}, x_{t}] + b_{o})

(8)

h_{t} = o_{t} \cdot T (C_{t})

(9)
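The three LSTM steps in Equations (4) to (9) can be followed in a scalar sketch. The code assumes a hidden state and input of size one, so each weight acting on the concatenation [h_(t−1), x_t] is a pair of numbers; the parameter dictionary and all numeric values are hypothetical.

```python
import math


def sigmoid(value):
    """Logistic function S used by the gates."""
    return 1.0 / (1.0 + math.exp(-value))


def gate(weights, h_prev, x_t, bias, activation):
    """Apply activation(w . [h_prev, x_t] + b) with scalar h_prev and x_t."""
    w_h, w_x = weights
    return activation(w_h * h_prev + w_x * x_t + bias)


def lstm_step(h_prev, c_prev, x_t, p):
    """One LSTM time step for scalar states, following Equations (4)-(9)."""
    f_t = gate(p["w_f"], h_prev, x_t, p["b_f"], sigmoid)       # Eq. (4) forget gate
    i_t = gate(p["w_i"], h_prev, x_t, p["b_i"], sigmoid)       # Eq. (5) input gate
    c_cand = gate(p["w_c"], h_prev, x_t, p["b_c"], math.tanh)  # Eq. (6) candidate
    c_t = c_prev * f_t + i_t * c_cand                          # Eq. (7) new cell state
    o_t = gate(p["w_o"], h_prev, x_t, p["b_o"], sigmoid)       # Eq. (8) output gate
    h_t = o_t * math.tanh(c_t)                                 # Eq. (9) new hidden state
    return h_t, c_t


# Hypothetical parameters: each weight is a (w_h, w_x) pair.
params = {
    "w_f": (0.1, 0.2), "b_f": 0.0,
    "w_i": (0.3, -0.1), "b_i": 0.0,
    "w_c": (0.5, 0.4), "b_c": 0.0,
    "w_o": (-0.2, 0.6), "b_o": 0.0,
}
h, c = 0.0, 0.0
for x_t in [0.2, 0.4, 0.6]:  # a short normalized load sequence
    h, c = lstm_step(h, c, x_t, params)
print(h, c)
```

With all weights and biases set to zero, every gate outputs 0.5 and the candidate is 0, so the cell state is simply halved at each step; this is a convenient sanity check for an implementation.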

Another variant that is simpler than LSTM is the GRU model. In this model, the cell state is removed, and the hidden state is used to transmit information. The GRU model has only two gates: the update gate (z_{t}) and the reset gate (r_{t}), which are shown in Figure 3b. The update gate is formed by merging the input gate and the forget gate into one gate. This gate works similarly to the forget and input gates of the LSTM, deciding whether to add or ignore information [35]. The reset gate, in turn, decides how much past information to remove. The values of the update gate (see Equation (10)) and the reset gate (see Equation (11)) determine the candidate hidden state (see Equation (12)) and the new hidden state, which is the output of the network. The formula for the new hidden state (h_(t)) is shown in Equation (13):

z_{t} = S (w_{z} \cdot [h_{(t - 1)}, x_{t}])

(10)

r_{t} = S (w_{r} \cdot [h_{(t - 1)}, x_{t}])

(11)

{h^{'}}_{t} = T (w \cdot [r_{t} \cdot h_{(t - 1)}, x_{t}])

(12)

h_{t} = ((1 - z_{t}) \cdot h_{(t - 1)}) + (z_{t} \cdot {h^{'}}_{t})

(13)
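A scalar sketch of one GRU step is given below. It follows Equations (10) to (12) as written, without bias terms, and for the last line it uses the standard GRU update, in which the new hidden state is an interpolation between the previous hidden state and the candidate, weighted by the update gate. The hidden size of one, the parameter names, and the numeric values are assumptions made for illustration.

```python
import math


def sigmoid(value):
    """Logistic function S used by the gates."""
    return 1.0 / (1.0 + math.exp(-value))


def gru_step(h_prev, x_t, p):
    """One GRU time step for a scalar hidden state."""
    w_zh, w_zx = p["w_z"]
    w_rh, w_rx = p["w_r"]
    w_h, w_x = p["w"]
    z_t = sigmoid(w_zh * h_prev + w_zx * x_t)             # Eq. (10) update gate
    r_t = sigmoid(w_rh * h_prev + w_rx * x_t)             # Eq. (11) reset gate
    h_cand = math.tanh(w_h * (r_t * h_prev) + w_x * x_t)  # Eq. (12) candidate state
    h_t = (1.0 - z_t) * h_prev + z_t * h_cand             # standard GRU interpolation
    return h_t


# Hypothetical parameters: each weight is a (w_h, w_x) pair.
params = {"w_z": (0.2, 0.5), "w_r": (-0.3, 0.4), "w": (0.6, 0.8)}
h = 0.0
for x_t in [0.2, 0.4, 0.6]:  # a short normalized load sequence
    h = gru_step(h, x_t, params)
print(h)
```

When the update gate is close to zero the previous hidden state is carried over almost unchanged, which is how the GRU preserves information over many time steps.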

LSTM and GRU models are both able to address load forecasting challenges. In reference [20], the authors combine an LSTM model with the Prophet model to achieve accurate load prediction. Reference [21], on the other hand, presents the GRU as an adaptive approach with a targeted design that captures variable temporal dependence and incorporates both the periodic and nonlinear characteristics of load forecasting problems.

2.3. Convolutional Neural Network (CNN)

The convolutional neural network (CNN) is one of the DL models mainly used for image processing analysis and pattern recognition [36]. This is because the network is able to learn highly abstracted features of objects, such as spatial data [37]. However, the CNN can also be used for time series prediction since it can automatically learn the features of sequence data involving multiple variables. Typically, a CNN [38] consists of several layers: convolutional layers, a pooling layer, a flattening layer, and a fully connected layer, which are shown in Figure 4.

The convolutional layers are the main part of the CNN and recognize patterns and features in the input data. A convolutional layer produces outputs called feature maps, which are generated by applying filters to the input data and can be used to detect relationships and patterns in it [36]. The second layer is the pooling layer. This layer subsamples the feature maps into smaller feature maps [37] to reduce the dimensionality and extract the dominant features for efficient training of the model [36]. Then, the flattening layer generates a one-dimensional vector that feeds the final layer of the CNN, the fully connected (FC) layer.
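The convolution, pooling, and flattening steps can be illustrated on a single six-hour window. The following is a minimal sketch under stated assumptions: stride one, no padding, a ReLU activation, max pooling, and two hand-picked filters; trained filters and the layer sizes of the actual models are not reproduced here.

```python
def conv1d(sequence, kernel, bias=0.0):
    """Slide a filter over the sequence (stride 1, no padding), then apply ReLU."""
    width = len(kernel)
    feature_map = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        response = sum(w * x for w, x in zip(kernel, window)) + bias
        feature_map.append(max(0.0, response))
    return feature_map


def max_pool1d(feature_map, pool_size=2):
    """Keep the largest value of each non-overlapping block of pool_size values."""
    return [
        max(feature_map[i:i + pool_size])
        for i in range(0, len(feature_map) - pool_size + 1, pool_size)
    ]


def flatten(feature_maps):
    """Concatenate several feature maps into one one-dimensional vector."""
    return [value for feature_map in feature_maps for value in feature_map]


window = [1.0, 2.0, 4.0, 3.0, 5.0, 6.0]      # hypothetical six-hour input
rise_filter = [-1.0, 1.0]                    # responds to hour-to-hour increases
mean_filter = [0.5, 0.5]                     # responds to the local level

pooled = [max_pool1d(conv1d(window, k)) for k in (rise_filter, mean_filter)]
features = flatten(pooled)
print(features)  # [2.0, 2.0, 3.0, 4.0]
```

The flattened vector is what a fully connected layer, or a recurrent layer in the hybrid models, would receive as input.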

The CNN model can be applied to address various challenges encountered in load forecasting. For instance, in study [23], the authors utilized 1D convolutional neural networks to extract valuable features from historical load data sequences. The proposed approach exhibits excellent performance in short-term load forecasting. Similarly, reference [24] introduces a similar study that employs CNN to extract informative features from input data. Subsequently, a CNN-Seq2Seq model with an attention mechanism based on a multi-task learning method is proposed for short-term multi-energy load forecasting.

2.4. Hybrid Deep Learning Model

Various techniques can be employed to develop a hybrid model for time series forecasting. In this comparative investigation, we used two distinct hybrid deep learning models based on the structure of the convolutional neural network (CNN), long short-term memory (LSTM), and gated recurrent unit (GRU). The first model comprises a fusion of the CNN and LSTM networks, whereas the second model integrates the CNN with the GRU network. The CNN–LSTM architecture is composed of a convolutional layer, a pooling layer, a flattening layer, and an LSTM network [14]. The CNN–GRU model shares a similar structure, except that the LSTM network is replaced by the GRU network. Both hybrid DL models used in this study, CNN–LSTM and CNN–GRU, operate in a sequential manner, where the output of one component serves as the input to the next component. The architectural layout of the CNN–LSTM or CNN–GRU hybrid deep learning model is illustrated in Figure 5.

3. Description of Proposed Methodology

To assess the performance of deep learning models in predicting the load one hour ahead under various conditions, the methodology presented in Figure 6 was employed. This method comprises several distinct stages: data collection, data preprocessing, variational mode decomposition, construction of the forecast model, and model evaluation.

3.1. Data Collection

Data collection, also known as data acquisition, refers to the process of gathering information from various sources. Before being stored, the data underwent a filtering and cleaning process. In the present study, the focus was on collecting univariate time-series data from the metallurgy industry in Poland. This dataset consists of a single variable, namely electricity consumption. Initially, the data were recorded at 15 min intervals using a power quality meter. For short-term forecasting, however, the dataset was resampled to hourly intervals using MATLAB. This resampling was carried out because the research primarily aimed to develop individual deep learning models for predicting load demand with a one-hour lead time under specific conditions. Hence, the methods developed in this study specifically cater to univariate time-series data.
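The resampling from 15 min to hourly resolution can be sketched as follows. The study performed this step in MATLAB and does not state the aggregation rule, so the snippet is only an illustration: it assumes that the hourly value is the mean of the four 15 min power readings in that hour, that the series starts on a full hour, and that an incomplete trailing hour is dropped. The function name is hypothetical.

```python
def resample_to_hourly(quarter_hour_kw):
    """Average each complete group of four 15 min readings into one hourly value."""
    samples_per_hour = 4
    usable = len(quarter_hour_kw) - len(quarter_hour_kw) % samples_per_hour
    return [
        sum(quarter_hour_kw[i:i + samples_per_hour]) / samples_per_hour
        for i in range(0, usable, samples_per_hour)
    ]


# Hypothetical readings in kW: two full hours plus one leftover sample.
readings = [100.0, 110.0, 120.0, 130.0, 200.0, 200.0, 200.0, 200.0, 50.0]
hourly = resample_to_hourly(readings)
print(hourly)  # [115.0, 200.0]
```

If the meter recorded energy per interval rather than average power, summing the four readings would be the appropriate aggregation instead.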

3.2. Data Preprocessing

Data preprocessing refers to the techniques used to prepare and convert raw data from various sources into a format suitable for DL models. Data preprocessing is critical because it improves data quality and makes the valuable information in the data accessible to the models. In this research, several data preprocessing techniques are applied sequentially: data normalization, dataset splitting, and reshaping of the data structure using the sliding window method.

3.2.1. Data Normalization

Datasets often come from various sources, and their parameters may have different units and scales. This variation can affect the performance of DL models during the learning phase and lead to increased generalization error [39]. Therefore, it is necessary to scale all variables within the dataset. DL models perform better when input variables are scaled to a standardized range. Min–max normalization is a popular technique that maps the original values of the dataset to a new range, here [0, 1] [14,40]. The mathematical formula for the min–max normalization used in this study is presented in Equation (14):

x^{'} = \frac{x - \min (x)}{\max (x) - \min (x)}

(14)
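Equation (14) and its inverse can be written as a short sketch. Two points are assumptions rather than statements from the study: the minimum and maximum are fitted on the training portion only, which is common practice to avoid leaking test information, and predictions are mapped back to kilowatts before error metrics are computed. The function names are hypothetical.

```python
def fit_min_max(values):
    """Return (min, max) of the reference data used for scaling."""
    low, high = min(values), max(values)
    if high == low:
        raise ValueError("min-max scaling is undefined for a constant series")
    return low, high


def normalize(values, low, high):
    """Apply Equation (14): x' = (x - min) / (max - min)."""
    return [(value - low) / (high - low) for value in values]


def denormalize(scaled, low, high):
    """Invert Equation (14) to recover values in the original unit."""
    return [value * (high - low) + low for value in scaled]


train_kw = [100.0, 150.0, 200.0]           # hypothetical training loads in kW
low, high = fit_min_max(train_kw)
scaled = normalize(train_kw, low, high)
restored = denormalize(scaled, low, high)
print(scaled)  # [0.0, 0.5, 1.0]
```

Values outside the fitted range, which can occur in the test data, map slightly below 0 or above 1.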

3.2.2. Dataset Splitting

In the dataset splitting stage, the observed values in a time-series dataset are grouped based on their similarity in time conditions. This study incorporated two types of time conditions: seasonal and day category. The seasonal type divides the univariate time-series data into sub-datasets representing winter, spring, summer, and autumn, whereas the day category divides them into sub-datasets for working days and weekends. In this study, these sub-datasets were used to develop forecasting models under different conditions. Each sub-dataset associated with a specific condition was further split into training, validation, and testing datasets. The primary objective of this process is to prepare datasets for training the deep learning model and for evaluating and optimizing its performance.
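The grouping by time condition needs nothing more than the time index of each observation. The snippet below is a minimal illustration, not the authors' implementation: the month-to-season mapping (December to February as winter, and so on) is an assumption, since the exact season boundaries are not stated, and the function names are hypothetical.

```python
from datetime import datetime

# Assumed meteorological seasons; the paper does not give the boundaries.
SEASON_BY_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
}


def season_of(timestamp):
    """Return the season label for a timestamp."""
    return SEASON_BY_MONTH[timestamp.month]


def day_category_of(timestamp):
    """Return 'weekend' for Saturday/Sunday and 'working day' otherwise."""
    return "weekend" if timestamp.weekday() >= 5 else "working day"


def group_by_condition(records):
    """Place each (timestamp, load) record in its season and day-category group."""
    groups = {}
    for timestamp, load in records:
        for label in (season_of(timestamp), day_category_of(timestamp)):
            groups.setdefault(label, []).append((timestamp, load))
    return groups


# Hypothetical hourly records (timestamp, load in kW).
records = [
    (datetime(2023, 1, 2, 8), 410.0),    # Monday in January
    (datetime(2023, 7, 15, 10), 120.0),  # Saturday in July
]
groups = group_by_condition(records)
print(sorted(groups))
```

Each record appears in exactly one seasonal group and one day-category group, which yields the six sub-datasets described above.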

There is no universally optimal ratio for dividing the original dataset into training, validation, and testing datasets. According to the literature, various approaches have been implemented to address dataset-splitting concerns. One such approach is presented in [14], where a ratio of 70% was used for the training dataset, 15% for the validation dataset, and 15% for testing. Other studies, referenced in [41,42], employed a different scenario with a 90% ratio for training and 10% for testing. Based on this literature, our study followed the scenario of 90% for training and 10% for testing. During model training, 20% of the training portion (that is, 18% of each sub-dataset) was held out as the validation dataset, leaving 72% for fitting the weights. This allocation is possible because the model-fitting routine provides an option to reserve a given fraction of the training data for validation. As a result, this process provides six separate sub-datasets based on the selected conditions (winter, spring, summer, autumn, working days, and weekends), each of which is further split into a training dataset (90%) and a testing dataset (10%).
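The split ratios can be checked with a short sketch. It assumes a chronological split in which the most recent samples form the test set and the last part of the training portion forms the validation set; the study does not state whether samples were shuffled, so this ordering is an assumption, as is the function name.

```python
def chronological_split(series, test_ratio=0.10, validation_ratio=0.20):
    """Split a series into train, validation, and test parts in time order.

    validation_ratio is applied to the training portion, not to the full series.
    """
    n_test = round(len(series) * test_ratio)
    train_full = series[:len(series) - n_test]
    test = series[len(series) - n_test:]

    n_validation = round(len(train_full) * validation_ratio)
    train = train_full[:len(train_full) - n_validation]
    validation = train_full[len(train_full) - n_validation:]
    return train, validation, test


series = list(range(1000))  # placeholder for 1000 hourly load values
train, validation, test = chronological_split(series)
print(len(train), len(validation), len(test))  # 720 180 100
```

For a sub-dataset of 1000 hourly values this gives 720 training, 180 validation, and 100 test samples, that is, 72%, 18%, and 10% of the data.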

3.2.3. Sliding Window Approach

As each individual sub-dataset (the training and testing dataset of each condition) is still in the form of time-series data, it is necessary to reshape its structure into a supervised learning dataset, since the deep learning models used in this study deal exclusively with supervised learning problems [8]. The dataset structure must consist of input patterns (X) and output patterns (y). The sliding window approach (see Figure 7) is commonly used for this purpose, where the values of the previous time steps serve as the input variables, and the value of the following time step serves as the output variable [38]. In this study, a sliding window with an input width of six and a label width of one was used. Specifically, the last six hours of data were taken as the input to predict the load one hour ahead of the current time (t). Figure 7 illustrates how a sliding window can be used to convert the structure of time-series data into a format suitable for supervised learning. The red column represents the input variable (X), which contains the values of the last six hours, the yellow column represents the output variable (y), which is the load one hour ahead, and the blue column represents the current time (t). The sliding window method was applied to all subsets of the training and test time-series datasets in this study.
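The sliding window transformation with an input width of six and a label width of one can be sketched as follows. The function name and the toy series are hypothetical; the widths are those stated in the text.

```python
def make_windows(series, input_width=6, label_width=1):
    """Turn a series into supervised pairs: input_width past values -> next label_width values."""
    inputs, labels = [], []
    last_start = len(series) - input_width - label_width
    for start in range(last_start + 1):
        split = start + input_width
        inputs.append(series[start:split])
        labels.append(series[split:split + label_width])
    return inputs, labels


series = list(range(10))  # placeholder for ten consecutive hourly load values
X, y = make_windows(series)
print(X[0], y[0])   # [0, 1, 2, 3, 4, 5] [6]
print(X[-1], y[-1])  # [3, 4, 5, 6, 7, 8] [9]
```

A series of n values yields n − 6 input and output pairs, so ten hourly values produce four training examples.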

3.3. Variational Mode Decomposition (VMD)

In this study, the VMD method was used to improve the accuracy of the proposed hybrid deep learning models by accounting for the nonlinear and non-stationary characteristics of the power consumption dataset. VMD is an adaptive [43], data-driven approach for decomposing signals with complex and non-stationary characteristics. The VMD algorithm decomposes the original signal into a set of mode functions that represent the different oscillation components at different frequencies and scales. These modes are obtained by iteratively solving an optimization problem whose objective is to minimize a cost function. From the viewpoint of empirical mode decomposition [44], these modes are referred to as signals that exhibit a difference of at most one between the number of local extrema and zero crossings. In subsequent related studies, this definition has been slightly modified, and the modes are referred to as intrinsic mode functions (IMFs).

The main objective of VMD is to construct and solve a variational problem [45]. This method breaks down a real-valued input signal into a set of sub-signals or modes, denoted as u_{k}, with specific sparsity properties while accurately representing the original signal [44]. The approach aims to minimize the total frequency bandwidth while ensuring that the sum of the decomposed modes equals the original input signal. This objective and constraint are given in Equation (15):

\min_{\{u_{k}\}, \{\omega_{k}\}} \left\{ \sum_{k = 1}^{N} \left\| \partial_{t} \left[ \left( \delta (t) + \frac{j}{\pi t} \right) * u_{k} (t) \right] e^{- j \omega_{k} t} \right\|_{2}^{2} \right\} \quad \text{s.t.} \quad \sum_{k = 1}^{N} u_{k} (t) = f (t)

(15)

In Equation (15), N is the desired number of modes to be decomposed, which is a positive integer, and k indexes the modes. \{u_{k}\} and \{ω_{k}\} refer to the k-th modal component and its center frequency, respectively, and f(t) is the original input signal. The function δ(t) represents the Dirac function, ∂_{t} denotes the partial derivative with respect to time, and (*) denotes the convolution operator [43].

To address the reconstruction constraint (refer to Equation (15)), a combination of a quadratic penalty term and Lagrangian multipliers is proposed to make the problem unconstrained. This augmented Lagrange expression is presented in Equation (16):

\mathcal{L} (\{u_{k}\}, \{\omega_{k}\}, \lambda) := \alpha \sum_{k} \left\| \partial_{t} \left[ \left( \delta (t) + \frac{j}{\pi t} \right) * u_{k} (t) \right] e^{- j \omega_{k} t} \right\|_{2}^{2} + \left\| f (t) - \sum_{k} u_{k} (t) \right\|_{2}^{2} + \left\langle \lambda (t), f (t) - \sum_{k} u_{k} (t) \right\rangle

(16)
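For orientation, the augmented Lagrangian is usually minimized by alternating updates in the frequency domain. The update rules below follow the standard VMD formulation and are an assumption here, since the article does not state them. Hats denote Fourier transforms, n is the iteration counter, α is the bandwidth penalty weight, λ is the Lagrangian multiplier, and τ is the multiplier step size (the "noise tolerance" parameter).

```latex
% Mode update: a Wiener-like filter centered at the current \omega_k
\hat{u}_{k}^{n+1}(\omega) =
  \frac{\hat{f}(\omega) - \sum_{i \neq k} \hat{u}_{i}(\omega) + \frac{\hat{\lambda}(\omega)}{2}}
       {1 + 2 \alpha (\omega - \omega_{k})^{2}}

% Center-frequency update: center of gravity of the mode's power spectrum
\omega_{k}^{n+1} =
  \frac{\int_{0}^{\infty} \omega \, |\hat{u}_{k}(\omega)|^{2} \, d\omega}
       {\int_{0}^{\infty} |\hat{u}_{k}(\omega)|^{2} \, d\omega}

% Multiplier update (dual ascent); with \tau = 0 the multiplier stays at its initial value
\hat{\lambda}^{n+1}(\omega) =
  \hat{\lambda}^{n}(\omega) + \tau \left( \hat{f}(\omega) - \sum_{k} \hat{u}_{k}^{n+1}(\omega) \right)
```

A larger α narrows the filter around each center frequency, which yields modes with a smaller bandwidth.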

3.4. Building Forecasting Model

In this phase, a separate predictive model is built for each specific condition. The aim is to identify the individual model that most accurately forecasts load demand one hour ahead under that condition. The deep learning (DL) models comprise four single networks, namely multilayer perceptron (MLP), long short-term memory (LSTM), gated recurrent unit (GRU), and convolutional neural network (CNN), as well as two hybrid networks, CNN–LSTM and CNN–GRU. These six models were used as the baseline models, whereas our study proposed the integration of VMD with the hybrid CNN–LSTM and CNN–GRU models (VMD–CNN–LSTM and VMD–CNN–GRU). The Keras and TensorFlow libraries were used as the primary frameworks to construct the architecture and layers of all the DL models, as shown in Table 1.

3.5. Model Evaluation

Model evaluation is a method for measuring the accuracy and effectiveness of predictive models using a test dataset. This dataset contained information that was not used during model training. In this study, the separate test dataset was divided into subsets, which were associated with different conditions. To evaluate the predictive models, test data were input into the models to generate predictions. The accuracy of the predictions was measured using three error metrics [40,46]: root mean square error (RMSE), which is presented in Equation (17), mean absolute error (MAE), as shown in Equation (18), and the mean absolute percentage error (MAPE) in Equation (19). The RMSE measures the spread of prediction errors [14,46], while the MAE calculates the average magnitude of prediction errors [33,35], and the MAPE measures the average percentage difference between the predicted value and actual value [47,48]. Smaller values of RMSE, MAE, and MAPE indicate better performance of the prediction model. The mathematical formulas for these error metrics are shown in the equation below:

RMSE = \sqrt{\frac{\sum_{t = 1}^{N} (y_{t} - \hat{y}_{t})^{2}}{N}}

(17)

MAE = \frac{\sum_{t = 1}^{N} |y_{t} - \hat{y}_{t}|}{N}

(18)

MAPE = \frac{100\%}{N} \sum_{t = 1}^{N} \left| \frac{y_{t} - \hat{y}_{t}}{y_{t}} \right|

(19)

The actual value and the predicted value at time t are denoted as y_{t} and \hat{y}_{t}, respectively, and N is the sample size of the test dataset.
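Equations (17)–(19) translate directly into a few lines of NumPy. The sketch below is illustrative only: the function names and the three sample values are hypothetical, and the MAPE function assumes that no actual value is zero.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error, Equation (17)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def mae(y_true, y_pred):
    """Mean absolute error, Equation (18)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs(y_true - y_pred)))

def mape(y_true, y_pred):
    """Mean absolute percentage error in percent, Equation (19).

    Assumes every actual value is non-zero.
    """
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(100.0 * np.mean(np.abs((y_true - y_pred) / y_true)))

# Hypothetical actual and predicted loads (kW).
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
print(rmse(actual, predicted), mae(actual, predicted), mape(actual, predicted))
```

RMSE and MAE carry the unit of the load (kW), whereas MAPE is dimensionless, which is why it is convenient for comparing conditions with different load levels.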

4. Dataset Description

The electricity load data used in this study were obtained from a metallurgical plant located in Poland. The dataset is a univariate time series containing a single variable, electricity consumption, measured in kilowatts (kW). The data span from 1 January 2019 to 31 December 2021 and were collected from power quality measurements or smart meters provided by a utility company. The dataset was originally recorded at 15-min intervals. The sampling frequency is the reciprocal of the sampling interval: with the interval expressed in hours (0.25 h), it is 1/0.25 = 4 samples per hour. However, because the objective of this study is to predict power consumption one hour ahead under different conditions, the original time series was resampled to 1-h granularity. Resampling from a 15-min interval to a 1-h interval can be accomplished by various methods, such as aggregating the data. As mentioned in the proposed methodology, the dataset then underwent preprocessing, including data normalization, data restructuring using the sliding window method, and dataset splitting, in order to provide a suitable dataset for the deep learning models.
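One way to carry out such a resampling is sketched below with pandas. The text does not say which aggregation was used, so taking the hourly mean of the four 15-min power readings is an assumption made for illustration, and the eight readings are invented.

```python
import pandas as pd

# Hypothetical 15-min power readings (kW) covering two hours.
index = pd.date_range("2019-01-01 00:00", periods=8,
                      freq=pd.Timedelta(minutes=15))
load_15min = pd.Series(
    [100.0, 120.0, 80.0, 100.0, 200.0, 200.0, 220.0, 180.0], index=index
)

# Sampling frequency: reciprocal of the interval expressed in hours.
samples_per_hour = 1 / 0.25

# Aggregate each block of four readings into one hourly value.
# Assumption: the mean is used because the variable is power (kW);
# a sum would be appropriate for energy per interval instead.
load_hourly = load_15min.resample(pd.Timedelta(hours=1)).mean()
print(samples_per_hour)
print(load_hourly)
```

Each hourly value is stamped with the start of its hour, so the two hours in this example are labeled 00:00 and 01:00.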

The original power consumption pattern within the dataset is depicted in Figure 8, indicating a consistent growth in electricity consumption from year to year. Upon extracting the dataset for year-by-year analysis (refer to Table 2), it was observed that the annual energy consumption escalated from 682 MWh in 2019 to 811.5 MWh in 2020, and further rose to 1190.9 MWh by the end of 2021 (see Figure 9). Furthermore, when considering specific circumstances, the dataset exhibits distinct patterns in relation to day categories and season categories (see Figure 9). Winter accounts for the highest percentage of annual electricity consumption, surpassing 27%, followed by autumn, spring, and summer. Weekdays exhibit higher electricity consumption compared to weekends, with over 80% of energy consumption occurring on weekdays. These trends suggest that load demand experiences seasonal and weekday/weekend fluctuations. Consequently, this study focuses on investigating the hypothesis that the performance of deep learning models for hour-ahead electric load forecasting varies significantly under different conditions in the metallurgy industry. Thus, separate forecasting models are required for each condition to accurately predict load demand.

5. Results and Discussion

In this study, a series of deep learning models encompassing various variants were developed. The objective was to create distinct models that could function autonomously for each predefined condition. Consequently, during the training phase of the experimental development, six deep learning models as the baseline and our proposed model were trained and compared for each season and day category condition. The purpose of this comparison was twofold: first, to determine the optimal variant of the deep learning model for specific conditions and second, to ascertain any significant disparities in the accuracy of hour-ahead electric load forecasting across different deep learning models under varying conditions within the metallurgy industry.

In this study, the Variational Mode Decomposition (VMD) was proposed as a solution to address the nonlinearity and nonstationary characteristics of the dataset. Our proposed approach involves decomposing the training and testing datasets using VMD before inputting the data into our hybrid model, which integrates the CNN–LSTM and CNN–GRU architectures. The VMD function employed in this study was based on the methodology outlined in reference [49]. To configure the input parameters of the VMD function, we set the moderate bandwidth constraint to 2000, noise tolerance to 0, number of modes to 3, and initialized omega uniformly with a value of 1. The output of the VMD algorithm yielded a collection of decomposed modes. Figure 10 illustrates the original time-series dataset presented under different seasonal and day category conditions, accompanied by the collection of decomposed modes revealed by VMD. The dataset shown in Figure 10 represents the testing dataset utilized in our study, which is associated with certain conditions.

Upon decomposing the signal, a thorough analysis was conducted on the resulting decomposed signal obtained through the variational mode decomposition (VMD) to investigate the inherent characteristics of the dataset. Numerous approaches can be employed to select the modes from the VMD. Throughout the developmental stage, a meticulous visual inspection of the decomposed modes enabled us to select Mode 2 as the input signal for our hybrid deep learning model. This visualization-oriented approach, exemplified by the utilization of plots in Figure 10, not only enhances the interpretability of our findings, but also effectively communicates the outcomes of our study.

Prior to feeding the dataset into the models employed in this study, careful consideration must be given to the data size and dimensions. This is because the utilized models were constructed using distinct architectures, resulting in varying requirements for input data treatment. For instance, the MLP model exclusively accepts a 2D array representation of the dataset, whereas other models, such as the LSTM, GRU, CNN, CNN–LSTM, and CNN–GRU, require a distinct dimensional dataset, specifically a 3D array. To accommodate these divergent requirements, the initial data preprocessing stage involves generating training datasets for each condition in a 2D array format, with data components denoted as [samples, time steps]. The number of samples corresponds to the rows in the dataset, and the number of time steps represents the inputs for the model. In this study, the time step value was set to six, utilizing the sliding window method to transform the sequence of time-series data into a supervised learning format. This entailed using the last six hours of the time-series sequence as input to predict the subsequent hour. To meet the requirements of other models that require a 3D array as the input dataset, it is necessary to convert the 2D array into a 3D array, incorporating an additional component referred to as features. The 3D array is structured with dimensions denoted as [samples, time steps, features], where the features correspond to the number of columns in each sample. In this study, the feature value was set to one for the dataset.
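The conversion from the 2D layout [samples, time steps] to the 3D layout [samples, time steps, features] is a single reshape. The sketch below assumes NumPy and uses a hypothetical array of four samples; only the time step value of six and the feature value of one come from the text.

```python
import numpy as np

# Hypothetical windowed inputs: 4 samples, 6 time steps each.
X_2d = np.arange(24, dtype=float).reshape(4, 6)

# MLP input: keep the 2D array [samples, time steps] as it is.
X_mlp = X_2d

# LSTM, GRU, CNN and hybrid inputs: append a feature axis of size one,
# giving [samples, time steps, features].
n_features = 1
X_3d = X_2d.reshape(X_2d.shape[0], X_2d.shape[1], n_features)

print(X_mlp.shape, X_3d.shape)
```

The reshape changes only the array's shape and leaves the values in place, so `X_3d[:, :, 0]` is identical to `X_2d`.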

During the model development phase, we trained and tested the baseline models and our proposed models, VMD–CNN–LSTM and VMD–CNN–GRU, on the datasets for each predetermined condition. In the training stage, it is crucial to configure the hyperparameters of a deep learning model, because they regulate the model's behavior and performance during training. Hyperparameters are predetermined parameters that are not learned from data. To ensure consistency across the different model structures, we applied the same hyperparameter settings to both the proposed and baseline models. Selecting hyperparameters generally involves a combination of systematic experimentation, best practices, and domain knowledge, so there is no single correct setting (automatic hyperparameter tuning aside). In this study, we relied on previous research to establish the configuration. The training optimizer was Adam, as implemented in [50,51]. The loss function used to calculate the error between the model's prediction and the target value was the mean squared error (MSE), based on [41,52]. The number of epochs was set to 100, as in [40]. The validation split was set to 0.2, following [14]. Finally, the batch size, which defines the number of samples processed before the internal model parameters are updated, was set to the default value of 32.
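To make the effect of these settings concrete, the sketch below works out how many samples are used for fitting and how many parameter updates occur. It assumes that the validation split holds out the last 20% of the training samples, which is how the `validation_split` argument of Keras `Model.fit` behaves, and that a final partial batch still triggers an update. The sample count of 1000 and the function name are hypothetical.

```python
import math

def training_plan(n_samples, validation_split=0.2, batch_size=32, epochs=100):
    """Return the split sizes and update counts implied by the settings."""
    # Samples before the split point are used for fitting; the rest
    # (the last fraction) are held out for validation.
    n_train = int(math.floor(n_samples * (1.0 - validation_split)))
    n_val = n_samples - n_train
    # One parameter update per batch, including a final partial batch.
    updates_per_epoch = math.ceil(n_train / batch_size)
    return {
        "n_train": n_train,
        "n_val": n_val,
        "updates_per_epoch": updates_per_epoch,
        "total_updates": updates_per_epoch * epochs,
    }

# Hypothetical number of windowed training samples for one condition.
plan = training_plan(n_samples=1000)
print(plan)
```

Because the condition-specific subsets differ in size, the same hyperparameters imply a different number of updates for each condition, which is one reason training times differ between datasets.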

In this section, we undertake a comparative analysis between our proposed model and the baseline models to elucidate key distinctions. Our analysis centers on two pivotal parameters: the duration required for training and the model evaluation outcomes, encompassing the utilization of statistical metrics, such as RMSE, MAE, and MAPE. The quantification of the training time during the training phase of the deep learning model serves as a reliable gauge for the duration required to train the model on a given dataset. This evaluative measure facilitates the assessment of a model’s efficiency and computational prerequisites, thereby enabling the identification of models that optimally align with specific requirements. Concurrently, the appraisal of model evaluation assumes a vital role in gauging the accuracy and efficacy of predictive models deployed on previously unseen datasets, commonly known as the testing dataset.

Table 3 provides a comprehensive comparison of the computation times for the DL prediction models during the training phase for different datasets. It is evident from Table 3 that the GRU model takes the longest time to train, not only for seasonal forecasts but also for working days and weekend datasets, followed by the LSTM model. This can be attributed to the sequential processing nature of both GRU and LSTM models, wherein each subsequent step relies on the output of the preceding step [14]. Conversely, the Multilayer Perceptron (MLP) model was the fastest DL forecasting model for training. The proposed models, VMD–CNN–LSTM and VMD–CNN–GRU, exhibit training times that fall within a moderate range, occupying a position between the longer durations observed in certain models and the swiftness of the MLP. Nonetheless, upon meticulous examination, these models demonstrate a satisfactory suitability for load forecasting development scenarios across diverse conditions.

Regarding the assessment of model performance, we employed three previously mentioned metrics to evaluate the accuracy of all the trained deep learning (DL) models. These metrics facilitate the comparison of model performance across individual test subsets, allowing for the identification of the most appropriate DL model for load prediction under specific seasonal and day category conditions. During the model evaluation stage, we assessed all baseline models and proposed models that were previously trained and saved, based on their respective conditions, using separate testing datasets tailored to those conditions. As presented in Table 4, our proposed model demonstrated lower scores in every seasonal and day category condition compared to the baseline models. This outcome signifies that the model exhibits good performance in predicting the electrical load of a metallurgy plant one hour-ahead compared to the baseline models.

Specifically, for the winter dataset, the VMD–CNN–LSTM model outperformed the other models, with an RMSE of approximately 16.805 kW and a MAE of 12.634 kW. However, in terms of the MAPE score, the VMD–CNN–GRU model performed better, with a score of 0.153%. For the spring and autumn datasets, the VMD–CNN–LSTM model demonstrated excellent performance compared to the other models under evaluation. Conversely, in the summer dataset, the VMD–CNN–GRU model exhibited outstanding results in terms of both the RMSE metric, with a score of approximately 14.389 kW, and MAE metric, with a score of 11.376 kW. Notably, both the proposed models yielded the same MAPE score (0.116%) in this scenario.

Regarding the categorization of days, the proposed VMD–CNN–LSTM and VMD–CNN–GRU models exhibited superior performance compared with the baseline models, as reflected by lower scores in terms of RMSE, MAE, and MAPE. Specifically, in the working-days dataset, the VMD–CNN–GRU model demonstrated outstanding performance compared to the VMD–CNN–LSTM model, achieving the lowest scores of approximately 12.115 kW for RMSE, 9.818 kW for MAE, and 0.079% for MAPE. Similarly, in the weekend dataset, the VMD–CNN–GRU model continues to outperform, yielding scores of approximately 11.075 kW for RMSE, 8.367 kW for MAE, and 0.137% for MAPE.

When the evaluation results of each model are averaged per metric, the VMD–CNN–LSTM model has an average MAE of 9.543 kW across all seasonal conditions, whereas the VMD–CNN–GRU model has an average MAE of 10.453 kW. In terms of RMSE, the VMD–CNN–LSTM model had an average value of 12.21 kW, whereas the VMD–CNN–GRU model had an average value of 13.41 kW. The average MAPE across the seasons was 0.095% for the VMD–CNN–LSTM model and 0.10% for the VMD–CNN–GRU model. Therefore, it can be concluded that the VMD–CNN–LSTM model performs better in predicting the electrical load one hour ahead under different seasonal conditions. When considering the results based on day conditions (the average of the working-days and weekend results), the VMD–CNN–GRU model has an average RMSE of 11.595 kW, whereas the VMD–CNN–LSTM model has an average of 12.21 kW. In terms of MAE, the VMD–CNN–GRU model achieved an average value of 9.09 kW, whereas the VMD–CNN–LSTM model had an average of 9.54 kW. The average MAPE was 0.079% for the VMD–CNN–GRU model and 0.11% for the VMD–CNN–LSTM model. Based on these results, it can be concluded that the VMD–CNN–GRU model performs better in predicting the electrical load one hour ahead under day category conditions.
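The day-category averages are plain arithmetic means over the two test subsets. As a short check, the sketch below averages the VMD–CNN–GRU RMSE and MAE scores quoted in the text for the working-days and weekend datasets; the dictionary layout and function name are illustrative choices, not the study's code.

```python
# VMD-CNN-GRU test scores quoted in the text for the two day categories (kW).
scores = {
    "working_days": {"rmse": 12.115, "mae": 9.818},
    "weekend": {"rmse": 11.075, "mae": 8.367},
}

def average_metric(scores, metric):
    """Arithmetic mean of one metric over all conditions."""
    values = [condition[metric] for condition in scores.values()]
    return sum(values) / len(values)

avg_rmse = average_metric(scores, "rmse")
avg_mae = average_metric(scores, "mae")
print(f"average RMSE: {avg_rmse:.3f} kW")
print(f"average MAE:  {avg_mae:.3f} kW")
```

The results, about 11.595 kW for RMSE and about 9.09 kW for MAE, agree with the day-category averages reported for the VMD–CNN–GRU model.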

In this experiment, the inclusion of Variational Mode Decomposition (VMD) techniques significantly improves the performance of the hybrid CNN–LSTM and CNN–GRU models. This enhancement is evidenced by achieving lower scores across all metric evaluations compared with the baseline models of CNN–LSTM and CNN–GRU, which do not incorporate VMD. The effectiveness of VMD has been further highlighted in previous studies, such as the work referenced in [53], where VMD was utilized in conjunction with a CNN–LSTM model. The experimental results of that study demonstrated superior performance compared with a recent method employing the same database, achieving an average accuracy of 98.65%. Likewise, in reference [54], the author employed a hybrid VMD–CNN–GRU model for the short-term forecasting of wind power. The proposed model exhibited exceptional performance in short-term forecasting, with notable metrics such as an RMSE of 1.5651, a MAE of 0.8161, a MAPE of 11.62%, and an R2 value of 0.9964. These results demonstrate the effectiveness of the proposed model in accurately predicting the wind power in the short term. Reference [55] introduced an integrated hybrid model called CNN–LSTM–MLP, which incorporates error correction and the variational mode decomposition (VMD). This study claims that the proposed model surpasses numerous conventional alternative approaches in terms of both accuracy and robustness. In reference [56], a novel approach to sparrow search algorithms integrated with VMD-LSTM was presented. The proposed model demonstrates significant enhancements in prediction accuracy and a reduction in wind power prediction errors compared with alternative methods. These findings provide empirical evidence supporting the effectiveness of the proposed prediction model.

Figure 11 shows the one-hour-ahead forecasts of the electrical load of the metallurgy plant produced by the proposed VMD–CNN–LSTM and VMD–CNN–GRU models under different seasonal conditions. Figure 11a illustrates the results of a one-day load forecast during the winter season, covering the period from ‘2021-12-05T08:00:00’ to ‘2021-12-06T07:00:00’. Similarly, Figure 11b presents the one-day load forecast during the spring season, spanning from ‘2021-05-04T15:00:00’ to ‘2021-05-05T14:00:00’. Figure 11c provides the one-day load forecast during the summer season, from ‘2021-08-04T15:00:00’ to ‘2021-08-05T14:00:00’. Finally, Figure 11d displays the prediction results during the autumn season, covering the period from ‘2021-11-03T22:00:00’ to ‘2021-11-04T21:00:00’. The one-day load forecasts for the different day categories are depicted in Figure 12. Figure 12a presents the forecasted electrical load for a working day (Wednesday), spanning from ‘2021-09-15T01:00:00’ to ‘2021-09-16T00:00:00’, whereas Figure 12b shows the forecasted power consumption of the metallurgy plant during the weekend (Sunday), covering the period from ‘2021-09-12T01:00:00’ to ‘2021-09-18T00:00:00’.

6. Conclusions

Electric load forecasting is of paramount importance for power system operation and planning as well as in the electricity market. Thus, there is an urgent need for an efficient approach to solve the load forecasting problem. In this study, the time factor was considered a crucial variable that could affect the load demand patterns. Consequently, the timestamp of the dataset was distinguished, and the data points were grouped based on similar conditions. It was found that the load demand varied according to seasons, working days, and weekends.

The proposed methodology aimed to develop a customized forecasting model to determine the load demand one hour ahead under specific conditions. In this study, we proposed the integration of variational mode decomposition (VMD) with the hybrid CNN–LSTM and CNN–GRU models. The main objective of VMD is to deal with the nonlinearity and non-stationarity of the power consumption data used by the deep learning models. The CNN can autonomously derive intricate spatial characteristics from electrical load data, while the GRU or LSTM can directly derive temporal features from previously recorded input data. Various popular deep learning models, including MLP, LSTM, GRU, CNN, CNN–LSTM, and CNN–GRU, were used in this study as baseline models. These models were compared with our proposed VMD–CNN–GRU and VMD–CNN–LSTM models to assess the performance and effectiveness of the proposed models.

In this study, we conducted a comparative analysis between our proposed model and baseline models to elucidate key distinctions. Our analysis focused on two pivotal parameters: the duration required for training and the outcomes of model evaluation, which encompassed the utilization of statistical metrics such as RMSE, MAE, and MAPE. In the experimental results obtained during the training stage, we observed that the GRU variant of the deep learning model required the longest training duration across diverse datasets. Conversely, the MLP model exhibited the fastest training speed when using all sub-datasets based on different conditions. Based on these observations, we determined that our proposed models, VMD–CNN–LSTM and VMD–CNN–GRU, exhibited training times that fell within a moderate range, occupying a position between the longer durations observed in certain models and the swiftness of MLP. Upon meticulous examination, these models demonstrated satisfactory suitability for load forecasting development scenarios across diverse conditions.

In terms of model performance, our proposed models consistently outperform the baseline models, with lower RMSE, MAE, and MAPE scores across all datasets. Specifically, the VMD–CNN–LSTM model performs particularly well in predicting the electrical load one hour ahead under seasonal conditions, achieving the lowest error metrics across most seasonal datasets. The VMD–CNN–GRU model, on the other hand, excels in predicting the electrical load under day category conditions. Both proposed models demonstrate the advantage of incorporating VMD, which enhances the overall performance. This is evident when comparing them with the baseline CNN–LSTM and CNN–GRU models, which do not use VMD.

Based on our experimental findings, we can conclude that the utilization of deep learning models is highly effective for forecasting the electrical load one hour ahead. However, it is crucial to appropriately configure the hyperparameters of these models. Therefore, future research should focus on incorporating automatic hyperparameter tuning techniques to optimize the performance of our forecasting model. Furthermore, the application of the variational mode decomposition (VMD) has demonstrated its effectiveness in signal analysis and modeling, particularly in the context of time series analysis and forecasting tasks. VMD has proven to enhance the accuracy and performance of the models utilized in this study. As a result, we recommend the implementation of the VMD–CNN–LSTM and VMD–CNN–GRU models for one-hour-ahead load forecasting in diverse scenarios, particularly considering seasonal and daily variations. To further enhance the forecasting capabilities, future research should consider the integration of additional factors, such as weather data, and explore the broader applications of deep learning models in predicting electricity consumption.

Author Contributions

Conceptualization, F.A., V.S., P.J. and T.S.; Data curation, F.A., V.S. and T.S.; Formal analysis, F.A., V.S. and P.J.; Funding acquisition, T.S.; Investigation, F.A. and V.S.; Methodology, F.A., V.S. and P.J.; Project administration, T.S.; Resources, F.A., P.J., V.S. and T.S.; Software, F.A. and V.S.; Supervision, F.A., V.S. and P.J.; Validation, V.S. and P.J.; Visualization, F.A. and V.S.; Writing—Original draft, F.A., V.S. and P.J.; Writing—Review and editing, F.A., V.S., P.J. and T.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by The National Centre for Research and Development in Poland under contract SMARTGRIDSPLUS/4/5/MESH4U/2021 related to the project “Multi Energy Storage Hub For reliable and commercial systems Utilization” (MESH4U), from the funds of ERA-NET Smart Energy System, ERA-NET Smart Grids Plus Joint Call 2019 on Energy Storage Solutions (MICall2019), European SET-Plan Action 4 “Increase the resilience and security of the energy system”.

Data Availability Statement

The data are not publicly available due to the policy of the associate company.

Acknowledgments

This research uses data provided by the Alu-Frost company, Sowlany, Poland (Podlasie Voivodeship), the industry partner of the project and the location of the project demonstrator. The data were collected under the supervision of the project leader, Electrum Ltd., Białystok, Poland (Podlasie Voivodeship).

Conflicts of Interest

The authors declare that they have no known competing financial interest or personal relationships that could have appeared to influence the work reported in this paper.

References

Zielińska-Sitkiewicz, M.; Chrzanowska, M.; Furmańczyk, K.; Paczutkowski, K. Analysis of Electricity Consumption in Poland Using Prediction Models and Neural Networks. Energies 2021, 14, 6619. [Google Scholar] [CrossRef]
Chow, J.H.; Wu, F.F.; Momoh, J.A. Applied Mathematics for Restructured Electric Power Systems; Springer: Berlin/Heidelberg, Germany, 2006; pp. 1–9. [Google Scholar] [CrossRef]
Szymańska, E.J.; Mroczek, R. Energy Intensity of Food Industry Production in Poland in the Process of Energy Transformation. Energies 2023, 16, 1843. [Google Scholar] [CrossRef]
Lahouar, A.; Ben Hadj Slama, J. Day-Ahead Load Forecast Using Random Forest and Expert Input Selection. Energy Convers. Manag. 2015, 103, 1040–1051. [Google Scholar] [CrossRef]
Li, C.; Hu, R.; Hsu, C.Y.; Han, Y. Short-Term Power Load Forecasting Based on Feature Fusion of Parallel LSTM-CNN. In Proceedings of the 2022 IEEE 4th International Conference on Power, Intelligent Computing and Systems (ICPICS), Shenyang, China, 29–31 July 2022; pp. 448–452. [Google Scholar] [CrossRef]
Zhang, J.; Yan, J.; Infield, D.; Liu, Y.; Lien, F. sang Short-Term Forecasting and Uncertainty Analysis of Wind Turbine Power Based on Long Short-Term Memory Network and Gaussian Mixture Model. Appl. Energy 2019, 241, 229–244. [Google Scholar] [CrossRef] [Green Version]
Tarmanini, C.; Sarma, N.; Gezegin, C.; Ozgonenel, O. Short Term Load Forecasting Based on ARIMA and ANN Approaches. Energy Rep. 2023, 9, 550–557. [Google Scholar] [CrossRef]
Aggarwal, C.C. Neural Networks and Deep Learning; Determination Press: San Francisco, CA, USA, 2018; ISBN 9783319944623. [Google Scholar]
Kubat, M. Neural Networks: A Comprehensive Foundation by Simon Haykin, Macmillan, 1994, ISBN 0-02-352781-7. Knowl. Eng. Rev. 1999, 13, 409–412. [Google Scholar] [CrossRef]
Yu, C.; Li, Y.; Bao, Y.; Tang, H.; Zhai, G. A Novel Framework for Wind Speed Prediction Based on Recurrent Neural Networks and Support Vector Machine. Energy Convers. Manag. 2018, 178, 137–145. [Google Scholar] [CrossRef]
Aslam, M.; Lee, J.M.; Kim, H.S.; Lee, S.J.; Hong, S. Deep Learning Models for Long-Term Solar Radiation Forecasting Considering Microgrid Installation: A Comparative Study. Energies 2019, 13, 147. [Google Scholar] [CrossRef] [Green Version]
Tong, X.; Wang, J.; Zhang, C.; Wu, T.; Wang, H.; Wang, Y. LS-LSTM-AE: Power Load Forecasting via Long-Short Series Features and LSTM-Autoencoder. Energy Rep. 2022, 8, 596–603. [Google Scholar] [CrossRef]
Cecaj, A.; Lippi, M.; Mamei, M.; Zambonelli, F. Comparing Deep Learning and Statistical Methods in Forecasting Crowd Distribution from Aggregated Mobile Phone Data. Appl. Sci. 2020, 10, 6580. [Google Scholar] [CrossRef]
Aksan, F.; Li, Y.; Suresh, V.; Janik, P. CNN-LSTM vs. LSTM-CNN to Predict Power Flow Direction: A Case Study of the High-Voltage Subnet of Northeast Germany. Sensors 2023, 23, 901. [Google Scholar] [CrossRef] [PubMed]
Kim, J.H.; Lee, B.S.; Kim, C.H. A Study on the Development of Long-Term Hybrid Electrical Load Forecasting Model Based on MLP and Statistics Using Massive Actual Data Considering Field Applications. Electr. Power Syst. Res. 2023, 221, 109415.
Saad, Z.; Hazirah, A.J.N.; Suziana, A.; Azhar, M.A.A.; Yaacob, Z.; Ahmad, F.; Yusnita, M.A. Short-Term Load Forecasting of 415V, 11kV and 33kV Electrical Systems Using MLP Network. In Proceedings of the 2017 International Conference on Robotics, Automation and Sciences (ICORAS), Melaka, Malaysia, 27–29 November 2017; pp. 1–5.
Fekri, M.N.; Patel, H.; Grolinger, K.; Sharma, V. Deep Learning for Load Forecasting with Smart Meter Data: Online Adaptive Recurrent Neural Network. Appl. Energy 2021, 282, 116177.
Yahya, M.A.; Hadi, S.P.; Putranto, L.M. Short-Term Electric Load Forecasting Using Recurrent Neural Network. In Proceedings of the 2018 4th International Conference on Science and Technology (ICST), Yogyakarta, Indonesia, 7–8 August 2018; Volume 1, pp. 1–6.
Mujeeb, S.; Javaid, N.; Akbar, M.; Khalid, R.; Nazeer, O.; Khan, M. Big Data Analytics for Price and Load Forecasting in Smart Grids; Springer International Publishing: Cham, Switzerland, 2019; Volume 25, ISBN 9783030026134.
Bashir, T.; Haoyong, C.; Tahir, M.F.; Liqiang, Z. Short Term Electricity Load Forecasting Using Hybrid Prophet-LSTM Model Optimized by BPNN. Energy Rep. 2022, 8, 1678–1686.
Li, D.; Sun, G.; Miao, S.; Gu, Y.; Zhang, Y.; He, S. A Short-Term Electric Load Forecast Method Based on Improved Sequence-to-Sequence GRU with Adaptive Temporal Dependence. Int. J. Electr. Power Energy Syst. 2022, 137, 107627.
Inteha, A.; Nahid-Al-Masood. A GRU-GA Hybrid Model Based Technique for Short Term Electrical Load Forecasting. In Proceedings of the 2021 2nd International Conference on Robotics, Electrical and Signal Processing Techniques (ICREST), Dhaka, Bangladesh, 5–7 January 2021; pp. 515–519.
Kaligambe, A.; Fujita, G. Short-Term Load Forecasting for Commercial Buildings Using 1D Convolutional Neural Networks. In Proceedings of the 2020 IEEE PES/IAS PowerAfrica, Nairobi, Kenya, 25–28 August 2020; pp. 20–24.
Zhang, G.; Bai, X.; Wang, Y. Short-Time Multi-Energy Load Forecasting Method Based on CNN-Seq2Seq Model with Attention Mechanism. Mach. Learn. Appl. 2021, 5, 100064.
Rafi, S.H.; Al-Masood, N.; Deeba, S.R.; Hossain, E. A Short-Term Load Forecasting Method Using Integrated CNN and LSTM Network. IEEE Access 2021, 9, 32436–32448.
Sajjad, M.; Khan, Z.A.; Ullah, A.; Hussain, T.; Ullah, W.; Lee, M.Y.; Baik, S.W. A Novel CNN-GRU-Based Hybrid Approach for Short-Term Residential Load Forecasting. IEEE Access 2020, 8, 143759–143768.
Wilde, P.D. Neural Network Models; Springer: London, UK, 1997; ISBN 978-1-84628-614-8.
Haykin, S. Neural Networks and Learning Machines, 3rd ed.; Pearson: London, UK, 2008; ISBN 978-0131471399.
Haykin, S. Neural Network a Comprehensive Foundation; Macmillan: New York, NY, USA, 1994; Volume 2, ISBN 9780132265560.
Fine, T.L. Feedforward Neural Network Methodology, 1st ed.; Springer: Berlin/Heidelberg, Germany, 1999; ISBN 978-0-387-98745-3.
Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning; MIT Press: Cambridge, MA, USA, 2016; Available online: www.deeplearningbook.org (accessed on 8 July 2023).
Géron, A. Hands-On Machine Learning with Scikit-Learn and TensorFlow, 1st ed.; Nicole, T., Ed.; O’Reilly Media, Inc.: Sebastopol, CA, USA, 2017.
Wang, Y.; Zou, R.; Liu, F.; Zhang, L.; Liu, Q. A Review of Wind Speed and Wind Power Forecasting with Deep Neural Networks. Appl. Energy 2021, 304, 117766.
Hochreiter, S.; Schmidhuber, J. Long Short-Term Memory. Neural Comput. 1997, 9, 1735–1780.
Kumari, P.; Toshniwal, D. Deep Learning Models for Solar Irradiance Forecasting: A Comprehensive Review. J. Clean. Prod. 2021, 318, 128566.
Aslam, S.; Herodotou, H.; Mohsin, S.M.; Javaid, N.; Ashraf, N.; Aslam, S. A Survey on Deep Learning Methods for Power Load and Renewable Energy Forecasting in Smart Microgrids. Renew. Sustain. Energy Rev. 2021, 144, 110992.
Ghosh, A.; Sufian, A.; Sultana, F.; Chakrabarti, A.; De, D. Fundamental Concepts of Convolutional Neural Network; Springer: Berlin/Heidelberg, Germany, 2019; Volume 172, ISBN 9783030326449.
Suresh, V.; Janik, P.; Rezmer, J.; Leonowicz, Z. Forecasting Solar PV Output Using Convolutional Neural Networks with a Sliding Window Algorithm. Energies 2020, 13, 723.
Bhanja, S.; Das, A. Impact of Data Normalization on Deep Neural Network for Time Series Forecasting. arXiv 2018, arXiv:1812.05519.
Aksan, F.; Li, Y.; Suresh, V. Multistep Forecasting of Power Flow Based on LSTM Autoencoder: A Study Case in Regional Grid Cluster Proposal. Energies 2023, 16, 5014.
Wang, K.; Qi, X.; Liu, H. A Comparison of Day-Ahead Photovoltaic Power Forecasting Models Based on Deep Learning Neural Network. Appl. Energy 2019, 251, 113315.
Wang, K.; Qi, X.; Liu, H. Photovoltaic Power Forecasting Based LSTM-Convolutional Network. Energy 2019, 189, 116225.
Yu, P.; Fang, J.; Xu, Y.; Shi, Q. Application of Variational Mode Decomposition and Deep Learning in Short-Term Power Load Forecasting. J. Phys. Conf. Ser. 2021, 1883, 012128.
Dragomiretskiy, K.; Zosso, D. Variational Mode Decomposition. IEEE Trans. Signal Process. 2014, 62, 531–544.
Jing, X.; Luo, J.; Zhang, S.; Wei, N. Runoff Forecasting Model Based on Variational Mode Decomposition and Artificial Neural Networks. Math. Biosci. Eng. 2022, 19, 1633–1648.
Suresh, V.; Aksan, F.; Janik, P.; Sikorski, T.; Sri Revathi, B. Probabilistic LSTM-Autoencoder Based Hour-Ahead Solar Power Forecasting Model for Intra-Day Electricity Market Participation: A Polish Case Study. IEEE Access 2022, 10, 110628–110638.
Farsi, B.; Amayri, M.; Bouguila, N.; Eicker, U. On Short-Term Load Forecasting Using Machine Learning Techniques and a Novel Parallel Deep LSTM-CNN Approach. IEEE Access 2021, 9, 31191–31212.
Velasco, L.C.P.; Arnejo, K.A.S.; Macarat, J.S.S. Performance Analysis of Artificial Neural Network Models for Hour-Ahead Electric Load Forecasting. Procedia Comput. Sci. 2021, 197, 16–24.
Carvalho, V.R.; Moraes, M.F.D.; Braga, A.P.; Mendes, E.M.A.M. Evaluating Five Different Adaptive Decomposition Methods for EEG Signal Seizure Detection and Classification. Biomed. Signal Process. Control 2020, 62, 102073.
Wu, Y.; Wu, Q.; Zhu, J. Data-Driven Wind Speed Forecasting Using Deep Feature Extraction and LSTM. IET Renew. Power Gener. 2019, 13, 2062–2069.
Jahangir, H.; Aliakbar, M.; Alhameli, F.; Mazouz, A.; Ahmadian, A.; Elkamel, A. Short-Term Wind Speed Forecasting Framework Based on Stacked Denoising Auto-Encoders with Rough ANN. Sustain. Energy Technol. Assess. 2020, 38, 100601.
Wang, F.; Xuan, Z.; Zhen, Z.; Li, K.; Wang, T.; Shi, M. A Day-Ahead PV Power Forecasting Method Based on LSTM-RNN Model and Time Correlation Modification under Partial Daily Pattern Prediction Framework. Energy Convers. Manag. 2020, 212, 112766.
Fakhry, M. Variational Mode Decomposition and a Light CNN-LSTM Model for Classification of Heart Sound Signals. 2023. Available online: https://www.researchgate.net/profile/Mahmoud-Fakhry-2/publication/369439477_Variational_Mode_Decomposition_and_a_Light_CNN-LSTM_Model_for_Classification_of_Heart_Sound_Signals/links/641b355192cfd54f842063af/Variational-Mode-Decomposition-and-a-Light-CNN-LSTM-Model-for-Classification-of-Heart-Sound-Signals.pdf (accessed on 8 July 2023).
Zhao, Z.; Yun, S.; Jia, L.; Guo, J.; Meng, Y.; He, N.; Li, X.; Shi, J.; Yang, L. Hybrid VMD-CNN-GRU-Based Model for Short-Term Forecasting of Wind Power Considering Spatio-Temporal Features. Eng. Appl. Artif. Intell. 2023, 121, 105982.
Liu, J.; Huang, X.; Li, Q.; Chen, Z.; Liu, G.; Tai, Y. Hourly Stepwise Forecasting for Solar Irradiance Using Integrated Hybrid Models CNN-LSTM-MLP Combined with Error Correction and VMD. Energy Convers. Manag. 2023, 280, 116804.
Gao, X.; Guo, W.; Mei, C.; Sha, J.; Guo, Y.; Sun, H. Short-Term Wind Power Forecasting Based on SSA-VMD-LSTM. Energy Rep. 2023, 9, 335–344.

Figure 1. The structure of MLP.

Figure 2. Unrolled RNN.

Figure 3. The structure of (a) LSTM; (b) GRU.

Figure 4. CNN layers.

Figure 5. Hybrid deep learning model structure.

Figure 6. Proposed methodology.

Figure 7. Sliding window approach.

Figure 8. Power consumption.

Figure 9. Annual energy consumption.

Figure 10. Original signal and decomposed modes of the testing dataset for each condition: (a) winter; (b) spring; (c) summer; (d) autumn; (e) working day; (f) weekend.

Figure 11. One-day load forecast for each season: (a) winter; (b) spring; (c) summer; (d) autumn.

Figure 12. One-day load forecast for each day category: (a) working day; (b) weekend.

Table 1. Deep learning model structures.

Forecasting Model	Model Structure
MLP	FC layer (100 neurons, activation = ‘relu’) + FC layer (50 neurons, activation = ‘relu’)
LSTM	2 LSTM layers (15 neurons, activation = ‘relu’) + FC layer (50 neurons, activation = ‘relu’)
GRU	1 GRU layer (15 neurons) + 1 GRU layer (15 neurons, activation = ‘relu’)
CNN	2 Conv1D layers (32 filters, filter size 3, activation = ‘relu’) + MaxPooling1D (pooling size 2) + Flatten layer
CNN-LSTM	2 Conv1D layers (32 filters, filter size 3, activation = ‘relu’) + MaxPooling1D (pooling size 2) + Flatten layer + 1 LSTM layer (15 neurons, activation = ‘relu’)
CNN-GRU	2 Conv1D layers (32 filters, filter size 3, activation = ‘relu’) + MaxPooling1D (pooling size 2) + Flatten layer + 1 GRU layer (15 neurons, activation = ‘relu’)
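
Once an input window length is chosen, the convolutional stack listed in Table 1 (two Conv1D layers with 32 filters of size 3, then MaxPooling1D with pooling size 2, then Flatten) determines how many features reach the next stage. The sketch below works through that shape arithmetic in plain Python. Table 1 does not state the window length, padding mode, or stride, so the 24-sample window, single input channel, 'valid' padding, and stride 1 used here are assumptions for illustration only, and the function names are hypothetical.

```python
def conv1d_out_len(n, kernel, stride=1):
    """Output length of a 1-D convolution with 'valid' padding (assumed)."""
    return (n - kernel) // stride + 1


def maxpool1d_out_len(n, pool):
    """Output length of 1-D max pooling with stride equal to the pool size."""
    return (n - pool) // pool + 1


def cnn_block_summary(window, channels=1, filters=32, kernel=3, pool=2):
    """Trace the Table 1 CNN block: 2 x Conv1D -> MaxPooling1D -> Flatten.

    Returns the number of time steps after pooling, the flattened feature
    count, and the number of trainable weights in the two Conv1D layers.
    """
    length = window
    in_channels = channels
    conv_params = 0
    for _ in range(2):  # two stacked Conv1D layers
        length = conv1d_out_len(length, kernel)
        # each filter has kernel * in_channels weights plus one bias
        conv_params += (kernel * in_channels + 1) * filters
        in_channels = filters
    length = maxpool1d_out_len(length, pool)
    return {
        "steps": length,
        "flattened": length * filters,
        "conv_params": conv_params,
    }


if __name__ == "__main__":
    # 24-sample input window: an assumed value, not taken from the paper
    print(cnn_block_summary(window=24))
```

Under these assumptions a 24-sample window shrinks to 22 and then 20 steps across the two convolutions, is halved to 10 steps by pooling, and flattens to 320 features. A different window length or 'same' padding would change these numbers.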

Table 2. Statistical description of electrical load dataset.

Year	Condition	Number of Data Points	Min [kW]	Mean [kW]	Max [kW]
2019	Winter	2136	4.86	89.6	251.34
2019	Spring	2207	4.71	73.29	226.77
2019	Summer	2208	4.5	67.06	184.62
2019	Autumn	2185	4.32	82.77	262.41
2019	Working day	6240	4.86	90.70	262.41
2019	Weekend	2496	4.32	46.53	170.31
2020	Winter	2160	8.07	113.61	364.89
2020	Spring	2207	4.32	79.16	255.39
2020	Summer	2208	4.53	77.26	176.34
2020	Autumn	2185	6.57	101.07	339.24
2020	Working day	6264	5.16	112.46	364.89
2020	Weekend	2496	4.32	42.91	273.15
2021	Winter	2160	10.38	157.35	453.9
2021	Spring	2207	16.32	129.7	370.92
2021	Summer	2208	19.08	116.31	285.69
2021	Autumn	2184	15.36	141.01	361.95
2021	Working day	6264	17.22	159.92	453.9
2021	Weekend	2495	10.38	75.83	265.02

Table 3. Computational time of the DL models during the training stage.

DL Model	Training Time [s]
	Winter	Spring	Summer	Autumn	Working Day	Weekend
MLP	29.143	28.093	25.169	25.983	78.615	36.82
LSTM	92.823	85.917	85.456	87.381	254.36	107.855
GRU	108.459	99.565	98.665	102.724	319.193	135.598
CNN	31.152	28.723	27.927	29.089	93.256	44.678
CNN-LSTM	43.686	40.219	40.292	41.237	137.742	65.437
CNN-GRU	38.539	41.394	40.32	42.14	203.916	83.274
VMD–CNN–LSTM	77.465	84.121	84.269	84.12	204.108	68.36
VMD–CNN–GRU	84.472	84.259	60.182	59.409	178.587	84.396

Table 4. Model evaluation results on the separate testing datasets (RMSE and MAE in kW).

Condition	Evaluation Metric	Forecasting Model
		Baseline						Proposed
		MLP	LSTM	GRU	CNN	CNN-LSTM	CNN-GRU	VMD–CNN–LSTM	VMD–CNN–GRU
Winter	RMSE	54.194	53.71	50.229	52.069	54.978	51.63	16.805	21.787
	MAE	37.169	36.188	34.03	35.319	38.789	35.733	12.634	16.303
	MAPE	0.285	0.273	0.287	0.285	0.31	0.307	0.12	0.153
Spring	RMSE	34.187	36.139	34.939	33.509	33.365	32.489	4.876	4.894
	MAE	22.771	23.737	22.502	22.482	21.575	21.428	3.824	3.826
	MAPE	0.225	0.211	0.21	0.226	0.209	0.21	0.04	0.041
Summer	RMSE	39.063	39.106	37.62	38.285	40.215	39.054	14.607	14.389
	MAE	25.442	25.333	24.543	25.33	26.81	25.579	11.444	11.376
	MAPE	0.198	0.197	0.202	0.207	0.209	0.203	0.116	0.116
Autumn	RMSE	47.122	48.504	45.608	46.376	46.97	48.372	12.574	12.598
	MAE	31.762	32.519	30.577	30.873	32.544	33.151	10.272	10.31
	MAPE	0.257	0.239	0.264	0.245	0.268	0.26	0.105	0.105
Working day	RMSE	49.804	48.423	49.575	51.609	51.26	49.433	12.481	12.115
	MAE	35.007	33.324	34.247	37.037	35.597	34.881	9.987	9.818
	MAPE	0.228	0.225	0.226	0.241	0.231	0.24	0.081	0.079
Weekend	RMSE	24.192	23.402	23.369	23.537	23.796	23.969	11.939	11.075
	MAE	15.179	14.507	14.406	14.811	14.919	15.429	9.107	8.367
	MAPE	0.234	0.228	0.233	0.237	0.241	0.235	0.15	0.137
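
The average RMSE and MAE values quoted in the abstract can be reproduced from the rows of Table 4. The sketch below copies the RMSE and MAE entries of the two proposed models and averages them over the four seasons and over the two day categories. The dictionary layout and the function name are illustrative choices, not part of the original study.

```python
from statistics import mean

SEASONS = ("Winter", "Spring", "Summer", "Autumn")
DAY_TYPES = ("Working day", "Weekend")

# RMSE and MAE values [kW] of the two proposed models, copied from Table 4.
TABLE4 = {
    "VMD-CNN-LSTM": {
        "RMSE": {"Winter": 16.805, "Spring": 4.876, "Summer": 14.607,
                 "Autumn": 12.574, "Working day": 12.481, "Weekend": 11.939},
        "MAE": {"Winter": 12.634, "Spring": 3.824, "Summer": 11.444,
                "Autumn": 10.272, "Working day": 9.987, "Weekend": 9.107},
    },
    "VMD-CNN-GRU": {
        "RMSE": {"Winter": 21.787, "Spring": 4.894, "Summer": 14.389,
                 "Autumn": 12.598, "Working day": 12.115, "Weekend": 11.075},
        "MAE": {"Winter": 16.303, "Spring": 3.826, "Summer": 11.376,
                "Autumn": 10.31, "Working day": 9.818, "Weekend": 8.367},
    },
}


def average_metric(model, metric, conditions):
    """Arithmetic mean of one metric of one model over the given conditions."""
    return mean(TABLE4[model][metric][c] for c in conditions)


if __name__ == "__main__":
    for model in TABLE4:
        for metric in ("RMSE", "MAE"):
            seasonal = average_metric(model, metric, SEASONS)
            daily = average_metric(model, metric, DAY_TYPES)
            print(f"{model} {metric}: seasonal {seasonal:.4f} kW, "
                  f"daily {daily:.4f} kW")
```

For VMD–CNN–LSTM the seasonal averages come out at about 12.2155 kW (RMSE) and 9.5435 kW (MAE). For VMD–CNN–GRU the daily averages come out at 11.595 kW (RMSE) and 9.0925 kW (MAE). These agree with the figures reported in the abstract to within the last printed digit. The same averages also show the ordering the abstract describes: VMD–CNN–LSTM has the lower seasonal RMSE (about 12.22 kW against 13.42 kW), while VMD–CNN–GRU has the lower daily RMSE (11.595 kW against 12.21 kW).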

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
