1. Introduction
As the economy develops rapidly, ports have become critical nodes for domestic logistics and international trade, and their energy demand is increasing. Traditional fossil fuels cause severe environmental pollution, so ports must respond to ecological pressures by integrating sustainability considerations into their operations and moving toward more sustainable energy management to promote a green transition [1]. Major ports worldwide have been electrified to varying degrees, and many studies have shown that electrification significantly improves energy efficiency in ports [2]. Electricity accounts for an increasing proportion of port energy consumption, making energy management and optimization particularly important. In recent years, the growing use of renewable energy sources such as wind and solar power in ports, together with the development of energy substitution technologies, virtual power plants, and energy storage technologies, has created a need for accurate short-term forecasting of port load [3,4]. Energy storage technologies play an important role in ports: by discharging during peak demand and charging during low demand, they balance energy supply and demand and help maintain system stability [5]. Accurate port load forecasting and fast response capabilities enable energy storage systems to optimize dispatch and improve the reliability and economics of port area power systems.
Traditional load forecasting methods include time series analysis [6] and regression analysis [7]. The Autoregressive Integrated Moving Average (ARIMA) time series forecasting model has gained popularity due to its ability to handle both stationary and non-stationary series. Nano et al. [8] used the cuckoo search (CS) algorithm to optimize the parameters of an ARIMA model for predicting actual power load data, and the results showed that ARIMA achieved high accuracy in short-term power load prediction. Jeong et al. [9] used the Vector Autoregressive (VAR) model to predict building electrical loads and demonstrated good accuracy in multivariate time series forecasting. Using logistic regression as the base model, Feng et al. [10] proposed a load forecasting method that combines clustering with iterative logistic regression. Wu et al. [11] proposed an improved regression model based on mini-batch stochastic gradient descent to address the slow prediction speed and low prediction accuracy of regression analysis models; the results demonstrated that the modified algorithm significantly improves prediction speed. The theoretical foundation of traditional load forecasting methods is relatively mature, and their calculation is simple. However, their predictions are unstable and their accuracy is poor when dealing with highly complex and nonlinear data, such as port load data [12,13]. Moreover, because the reliable data required for predicting power loads in port areas are difficult to collect, traditional models struggle to adapt to rapidly changing environmental factors and complex port operations.
In recent years, deep learning models have become powerful tools for addressing such problems due to their excellent nonlinear fitting ability and adaptability. Typical representatives are the long short-term memory (LSTM) network [14,15], the gated recurrent unit (GRU) [16], DeepAR [17], N-BEATS [18], and transformers [19]. These methods model complex nonlinear load dynamics well, adapt to nonlinear fluctuations, and capture load data patterns precisely. LSTM has gained widespread attention for its performance in predicting power load time series, and many researchers have improved the basic LSTM model. Buratto et al. [20] proposed a Seq2Seq-LSTM model based on an attention mechanism to predict Brazil’s electricity load, achieving better capture of long-range dependencies in load sequences. Sheng et al. [21] proposed an improved residual LSTM-based framework for short-term load forecasting, which avoids the vanishing gradient problem when training deep neural networks. GRU is a simplified version of LSTM that reduces the model’s complexity and computational overhead. Wang et al. [22] used GRU with a gorilla troop optimizer (GTO) to predict and optimize energy consumption in the HVAC systems of smart buildings; the GTO tunes the parameters of the GRU model, enhancing prediction accuracy. In addition, some researchers [23] combined LSTM with convolutional neural networks (CNNs) to develop a hybrid cross-channel CNN-LSTM model for smart grid load forecasting, which improved prediction efficiency and accuracy compared to a single model. In general, combined models that integrate the advantages of multiple models show higher accuracy. Temporal convolutional networks (TCNs) capture local features of sequence data through stacked convolutional layers and use dilated convolutions to enlarge the receptive field, enabling the network to capture longer-range dependencies [24]. Zheng et al. [25] used TCNs and the Global Attention Mechanism (GAT) to model load time series data, improving prediction accuracy by filtering input variables using Shapley Additive Explanation (SHAP) values.
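The effect of dilation on the receptive field can be illustrated with a minimal NumPy sketch. It assumes one causal convolution per layer, a kernel size of 2, and dilations of 1, 2, and 4; these values and the function names `receptive_field` and `causal_dilated_conv` are illustrative assumptions, not the configuration used in the works cited above.

```python
import numpy as np

def receptive_field(kernel_size, dilations):
    """Receptive field of stacked causal convolutions (one convolution per layer)."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)

def causal_dilated_conv(x, weights, dilation):
    """Compute y[t] = sum_j weights[j] * x[t - j * dilation], padding with zeros before t = 0."""
    y = np.zeros(len(x), dtype=float)
    for j, w in enumerate(weights):
        shift = j * dilation
        if shift < len(x):
            y[shift:] += w * x[:len(x) - shift]
    return y

# Assumed toy configuration: three layers, kernel size 2, dilation doubling per layer.
dilations = [1, 2, 4]
impulse = np.zeros(16)
impulse[0] = 1.0

out = impulse
for d in dilations:
    out = causal_dilated_conv(out, np.ones(2), d)

# Number of output steps influenced by the single input at t = 0.
span = int(np.count_nonzero(out))
```

In this toy setting, the number of output steps reached by a single input equals the value returned by `receptive_field`, and doubling the dilation at each layer makes that value grow exponentially with depth rather than linearly.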
To further address the strong volatility of load data and fully explore its internal features, some researchers use modal decomposition algorithms such as empirical mode decomposition (EMD) [26], ensemble empirical mode decomposition (EEMD) [27], wavelet decomposition [28], and variational mode decomposition (VMD) [29] to decompose and smooth the load data before training a deep learning model. However, EMD and EEMD are prone to the mode-mixing problem, which can lead to significant decomposition errors [30]. VMD effectively avoids mode mixing through variational optimization. It also adapts better than wavelet decomposition to the complex components of a sequence, providing richer features for deep learning model prediction [31].
The energy demand at ports is significantly influenced by traffic demand; it differs notably across periods and exhibits strong temporal regularity. It is also affected by environmental factors, especially meteorological elements such as wind speed and temperature. High wind speeds can restrict the operation of cranes and other loading and unloading equipment, thereby affecting the efficiency of cargo handling [32]. Temperature fluctuations, on the other hand, directly affect energy consumption in the port area through cooling or heating demands. Deep learning algorithms have proven effective at handling the nonlinearity and the complex, multi-faceted influences in port load forecasting, and integrating a modal decomposition algorithm such as VMD with deep learning offers a promising solution for efficient energy management in port operations. To develop an accurate and efficient short-term port load forecasting model, this paper makes the following contributions:
A VMD-TCN-LSTM model is proposed for port load forecasting. By leveraging VMD to mitigate data volatility and extract features of varying frequencies, along with the integration of TCN and LSTM, the model can effectively capture temporal patterns and long-term dependencies;
Multi-feature modeling is used to enhance the prediction accuracy of the port power load forecasting model, with feature variables such as the temperature, the 10 m wind speed, the quarter, and the hour as model inputs;
Using real port load data, a case study was performed. The proposed model’s superiority was confirmed through comparative experiments with other widely used load forecasting models.
The structure of the paper is as follows:
Section 1 provides an introduction to the research background and related literature on short-term port load forecasting.
Section 2 describes the research objective and the theoretical methods adopted for short-term port load forecasting.
Section 3 focuses on data preprocessing, feature selection, and applying VMD decomposition to the original load data.
Section 4 analyzes the data and compares various models with specific cases to verify the effectiveness of the proposed forecasting model.
Section 5 discusses the limitations of this study and suggests future research directions. Finally, the conclusion section summarizes the main contributions of this study.
4. Case Study
4.1. Dataset Processing
The experimental dataset consists of load data recorded by smart meters for the whole port area of a coastal port in China from 1 January 2021 to 31 March 2022. In addition, port temperature data, 10 m wind speed data, and sunshine clarity index data matching the load sampling times were collected from the NASA website.
First, the dataset was divided into a training set and a testing set, with 80% of the data allocated for training and 20% for testing. Next, the continuous variables in both sets were standardized using the z-score method, as shown in Equation (20), to eliminate differences in measurement units among feature variables. Discrete data, such as temporal feature factors, were processed using one-hot encoding to prevent magnitude relationships among feature values from interfering with model training. The encoding method is illustrated in Table 5. Finally, sliding time windows were set for both the training and testing sets. The training set was then fed into the TCN-LSTM model for training. After training, the model was evaluated on the testing set, and the relevant evaluation metrics were calculated.
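The preprocessing steps above can be sketched as follows. This is a minimal illustration on synthetic data, not the pipeline used in this study: the chronological split, the use of training-set statistics for the z-score, the 24-step window, and all function names are assumptions made for the example.

```python
import numpy as np

def chronological_split(data, train_fraction=0.8):
    """Split along the time axis without shuffling (assumed split strategy)."""
    cut = int(len(data) * train_fraction)
    return data[:cut], data[cut:]

def zscore_fit(train):
    """Column-wise mean and standard deviation estimated on the training set."""
    mean = train.mean(axis=0)
    std = train.std(axis=0)
    return mean, np.where(std == 0, 1.0, std)

def one_hot(codes, n_classes):
    """Map integer category codes to one-hot rows."""
    return np.eye(n_classes)[codes]

def sliding_windows(features, target, window):
    """Pair each window of `window` consecutive steps with the next target value."""
    X = np.stack([features[i:i + window] for i in range(len(features) - window)])
    return X, target[window:]

# Synthetic hourly series standing in for the port data (hypothetical values).
rng = np.random.default_rng(0)
n = 200
hours = np.arange(n) % 24
load = 50 + 10 * np.sin(2 * np.pi * hours / 24) + rng.normal(0, 1, n)
temperature = 20 + rng.normal(0, 3, n)
continuous = np.column_stack([load, temperature])

train_c, test_c = chronological_split(continuous)
train_h, test_h = chronological_split(hours)
mean, std = zscore_fit(train_c)

# Standardized continuous columns followed by the one-hot encoded hour.
train_x = np.hstack([(train_c - mean) / std, one_hot(train_h, 24)])
test_x = np.hstack([(test_c - mean) / std, one_hot(test_h, 24)])

X_train, y_train = sliding_windows(train_x, train_x[:, 0], window=24)
X_test, y_test = sliding_windows(test_x, test_x[:, 0], window=24)
```

Estimating the mean and standard deviation on the training portion only is one common way to keep test-set information out of training; whether the study computed the statistics this way is not stated in the text.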
4.2. Hyperparameter Tuning of Forecasting Model
In this study, the random search (RS) algorithm was used to optimize the hyperparameters of the TCN-LSTM model. Unlike grid search, random search does not exhaustively explore the predefined parameter space. Instead, it randomly samples hyperparameter combinations from that space, evaluates each one by its performance metrics on the validation set, and keeps the best-performing combination. The main advantage of this method is that it can efficiently explore a wide parameter space, reducing computational costs.
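A minimal sketch of the random search procedure is given below. The search space, the number of trials, and the function `toy_validation_loss` are hypothetical; in practice the evaluation function would train the model with the sampled configuration and return its validation loss.

```python
import random

# Hypothetical search space; names and candidate values are illustrative only.
SEARCH_SPACE = {
    "window_size": [12, 24, 48],
    "tcn_layers": [2, 3, 4],
    "kernel_size": [2, 3, 5],
    "lstm_units": [32, 64, 128],
    "learning_rate": [1e-4, 1e-3, 1e-2],
    "dropout": [0.1, 0.2, 0.3],
}

def sample_config(space, rng):
    """Draw one value per hyperparameter, independently and uniformly."""
    return {name: rng.choice(values) for name, values in space.items()}

def random_search(evaluate, space, n_trials, seed=0):
    """Evaluate n_trials random configurations and keep the one with the lowest loss."""
    rng = random.Random(seed)
    best_config, best_score = None, float("inf")
    for _ in range(n_trials):
        config = sample_config(space, rng)
        score = evaluate(config)
        if score < best_score:
            best_config, best_score = config, score
    return best_config, best_score

def toy_validation_loss(config):
    """Stand-in for training a model and measuring its validation loss."""
    return (abs(config["window_size"] - 24) / 24
            + abs(config["dropout"] - 0.2)
            + config["learning_rate"])

best_config, best_score = random_search(toy_validation_loss, SEARCH_SPACE, n_trials=20)
```

With six hyperparameters of three candidate values each, a full grid would require 729 training runs, whereas the sketch evaluates only 20 sampled combinations.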
The TCN-LSTM model requires tuning both structural and training hyperparameters. To reduce the complexity of model tuning and improve training speed, this study used RS to optimize a set of key hyperparameters. The structural hyperparameters of the TCN include the time window size, the number of convolutional layers, and the kernel size. For the LSTM, they include the number of LSTM layers and the number of units per layer. The training hyperparameters comprise the initial learning rate and the dropout rate. Dropout is a regularization technique for deep neural networks that randomly sets a portion of neuron outputs to zero during training to reduce inter-neuron dependency, thereby mitigating overfitting. The learning rate is decayed with cosine annealing, the activation function is ReLU, the optimizer is Adam, and the loss function is the MSE. Training uses early stopping with the tolerance set to 10: performance on the validation set is monitored during training, and training stops early when it stops improving or starts to worsen, which prevents overfitting.
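The cosine annealing schedule and the early stopping rule can be sketched in plain Python as follows. The sketch assumes that the tolerance of 10 means 10 consecutive epochs without improvement in validation loss; the maximum learning rate, the simulated loss values, and the class name `EarlyStopping` are illustrative assumptions.

```python
import math

def cosine_annealing_lr(epoch, total_epochs, lr_max, lr_min=0.0):
    """Learning rate that decays from lr_max to lr_min along half a cosine period."""
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * epoch / total_epochs))

class EarlyStopping:
    """Signal a stop after `patience` consecutive epochs without improvement."""

    def __init__(self, patience=10, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one validation loss; return True when training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

# Simulated validation losses: five improving epochs followed by a plateau.
val_losses = [1.0, 0.8, 0.6, 0.5, 0.45] + [0.46] * 30
stopper = EarlyStopping(patience=10)
stopped_at = None
for epoch, loss in enumerate(val_losses):
    lr = cosine_annealing_lr(epoch, total_epochs=len(val_losses), lr_max=1e-3)
    if stopper.step(loss):
        stopped_at = epoch
        break
```

In a real training loop, the model weights from the epoch with the best validation loss would typically be restored after stopping; that step is omitted here.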
In summary, this study utilized the RS algorithm to optimize the above structural and training hyperparameters of the TCN-LSTM model. Finally, the model with the optimal combination of hyperparameters was used to forecast the load.
4.3. Comparison of Prediction Effects Based on Multifeature
To verify the enhancement of model prediction performance by incorporating multiple features, the TCN-LSTM model was used to conduct predictions under the following two scenarios: (1) a univariate prediction based solely on historical load data, without utilizing the VMD algorithm for decomposition; (2) a multivariate prediction based on multiple features (temperature, 10 m wind speed, the sunshine clarity index, quarters, hours), but without decomposition of the historical load data through the VMD algorithm.
The models in both scenarios were tuned using the random search algorithm and the early stopping method. The hyperparameter combination that performed best on the validation set was selected for prediction on the test set. The hyperparameter optimization results are shown in Table 6. The learning loss curves of the models with the optimal hyperparameter combinations under the two scenarios are shown in Figure 12; the validation loss curves show no significant upward trend when training terminates, indicating that the models do not overfit.
Once training was completed, the models were used to predict the load on the test set. The predictions for the load of a specific port over two days, including 24 sampling points, are illustrated in Figure 13. The overall evaluation metrics for the test set predictions are presented in Table 7. After hyperparameter optimization through random search, with early stopping used to prevent overfitting, the TCN-LSTM model with multi-feature input fits the real load values and trends of the testing set more closely. Furthermore, all of its performance indicators in Table 7 surpass those achieved in scenario (1), demonstrating that introducing multiple feature variables can effectively enhance the TCN-LSTM model’s accuracy in predicting port electricity load.
4.4. Comparative Evaluation of Decomposition Algorithms
As observed from Figure 13, the inherent high volatility and unpredictability of the load sequence can lead to significant errors during peak and trough periods. To enhance prediction accuracy and improve the model’s adaptability to load fluctuations, this study employed decomposition algorithms to mine deep feature information within the load sequence. To determine which decomposition algorithm is best suited to uncovering the internal characteristics of port load sequences, predictions were made using the TCN-LSTM model combined with the VMD, EEMD, and CEEMDAN decomposition algorithms. The prediction results of each model on the test set after training are shown in Figure 14, with the evaluation metrics provided in Table 8.
Compared to the evaluation metrics of the TCN-LSTM model presented in Table 7, all three modal decomposition algorithms improve the prediction accuracy of the TCN-LSTM model. However, because EEMD cannot avoid the mode-mixing problem during decomposition, it fails to extract sufficient features from the sequence data, leading to subpar prediction accuracy. In contrast, the VMD-TCN-LSTM model significantly outperforms both the EEMD-TCN-LSTM and CEEMDAN-TCN-LSTM models in terms of R², MSE, and MAPE. Moreover, it better fits the actual load values during peak and trough periods, indicating that VMD, relative to EEMD and CEEMDAN, more effectively reduces the random volatility of load signals. This improves the model’s prediction performance and enables efficient mining of the internal feature information of the load data.
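For reference, the three evaluation metrics can be computed as in the following sketch, which uses their standard definitions. The function names and the four sample values are hypothetical and are not taken from the experimental results.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean((y_true - y_pred) ** 2))

def mape(y_true, y_pred):
    """Mean absolute percentage error in percent; assumes no zero values in y_true."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100)

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1 - ss_res / ss_tot)

# Hypothetical load values, for illustration only.
actual = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]

print(f"MSE:  {mse(actual, predicted):.2f}")
print(f"MAPE: {mape(actual, predicted):.2f}%")
print(f"R2:   {r2(actual, predicted):.3f}")
```

Lower MSE and MAPE and an R² closer to 1 indicate a better fit. Because the MAPE divides by the actual value, it is undefined when the actual load is zero.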
4.5. Comparison of Different Prediction Models
The VMD-TCN-LSTM model proposed in this study is also compared with other commonly used time series prediction models, including GRU, LSTM, XGBoost, and VMD-LSTM, to analyze their prediction performance. The multi-feature input method was used for all models. The results are illustrated in Figure 15, with the evaluation metrics detailed in Table 9.
The comparison reveals significant advantages of the proposed model across the key evaluation metrics R², MSE, and MAPE. On actual load data, the VMD-TCN-LSTM model reduces the MAPE by 4.99%, decreases the MSE by 33.38, and improves R² by 0.43 compared to the single LSTM model. This confirms the effectiveness of the TCN model in mining the potential temporal features of load sequences. Furthermore, stabilizing the load sequence via VMD further enhances the model’s prediction performance, affirming the significant role of VMD decomposition in boosting predictive capability. Thus, the short-term port load forecasting method proposed in this study demonstrates superior predictive performance.
5. Discussion
This study validates the proposed predictive model through a practical case analysis. The results demonstrate that the model can effectively predict short-term load trends in ports, which is crucial for sustainable energy management and planning in these areas.
By extracting key features across different frequencies from the original load data, the VMD method has proven to be effective in processing port load data. Moreover, while previous studies employing single predictive models have achieved acceptable results, combining the strengths of multiple models may reveal greater potential in future research. Exploring how to effectively integrate different predictive models will be an important direction for future load forecasting research. Additionally, analyzing key features that influence port load is also critical for port load forecasting. Choosing appropriate and highly relevant feature inputs can effectively enhance the model’s accuracy.
Finally, this study has some shortcomings. To ensure generalization ability, deep learning usually requires a relatively large training sample, and the actual port load data used in this study may be insufficient. It is therefore essential to collect load series covering a longer period to further validate and optimize the model in future studies. Nevertheless, the comparison with other common forecasting models in the case study still demonstrates the superiority of the proposed forecasting method. In addition, the load data in this study come from a coastal port in China. Considering the differences in operation between coastal and inland ports, whether the model can be directly applied to inland ports still needs to be verified. Future studies should be extended to different types of ports to enhance the model’s applicability.