Article

Dynamic Adaptive Artificial Hummingbird Algorithm-Enhanced Deep Learning Framework for Accurate Transmission Line Temperature Prediction

1 Changchun Institute of Technology, Institute of the Future Innovative Industry and Technology, Changchun 130000, China
2 School of Electrical and Electronic Engineering, Changchun University of Technology, Changchun 130000, China
3 State Grid Jilin Electric Power Co., Ltd., Ultra High Voltage Company, Changchun 130000, China
* Author to whom correspondence should be addressed.
Electronics 2025, 14(3), 403; https://doi.org/10.3390/electronics14030403
Submission received: 10 January 2025 / Revised: 15 January 2025 / Accepted: 16 January 2025 / Published: 21 January 2025

Abstract

As power demand increases and the scale of power grids expands, accurately predicting transmission line temperatures is becoming essential for ensuring the stability and security of power systems. Traditional physical and statistical models struggle with complex multivariate time series, often failing to balance short-term fluctuations with long-term dependencies, and their prediction accuracy and adaptability remain limited. To address these challenges, this paper proposes a deep learning model architecture based on the Dynamic Adaptive Artificial Hummingbird Algorithm (DA-AHA), named the DA-AHA-CNN-LSTM-TPA (DA-AHA-CLT). The model integrates convolutional neural networks (CNNs) for local feature extraction, long short-term memory (LSTM) networks for temporal modeling, and temporal pattern attention mechanisms (TPA) for dynamic feature weighting, while the DA-AHA optimizes hyperparameters to enhance prediction accuracy and stability. The traditional artificial hummingbird algorithm (AHA) is further improved by introducing dynamic step-size adjustment, greedy local search, and grouped parallel search mechanisms to balance global exploration and local exploitation. Our experimental results demonstrate that the DA-AHA-CLT model achieves a coefficient of determination (R2) of 0.987, a root-mean-square error (RMSE) of 0.023, a mean absolute error (MAE) of 0.018, and a median absolute error (MedAE) of 0.011, outperforming traditional models such as CNN-LSTM and LSTM-TPA. These findings confirm that the DA-AHA-CLT model effectively captures the complex dynamic characteristics of transmission line temperatures, offering superior performance and robustness in full-time-step prediction tasks, and highlight its potential for solving challenging multivariate time-series forecasting problems in power systems.

1. Introduction

As global power demand continues to rise, the scale and complexity of power grids have grown significantly, making the accurate prediction of transmission line temperatures an increasingly important area of research for ensuring the stability and security of power systems [1]. Transmission line temperature is influenced not only by load factors such as current and voltage, but also by external conditions such as ambient temperature, wind speed, and humidity. Inaccurate predictions of transmission line temperature can lead to equipment overload, accelerated line aging, or even severe accidents, posing a threat to the stability of the power system [2]. Therefore, developing high-precision and highly reliable transmission line temperature prediction methods is crucial for optimizing power system operations and reducing operational and maintenance costs.
Traditional transmission line temperature prediction methods primarily rely on physical or statistical models [3]. However, these approaches often face limitations when handling multivariate time-series data and complex dynamic environments. Physical models depend heavily on extensive a priori knowledge and assumptions, making them difficult to adapt to changing external conditions [4]. Statistical models, on the other hand, struggle with prediction accuracy when dealing with highly nonlinear and multivariate correlated time series. Moreover, traditional models struggle to balance the modeling requirements of short-term trends and long-term dependencies, leading to predictions that deviate from actual conditions and limiting their applicability to modern power systems [5].
In recent years, with continuous technological advancements and innovations in applications, a variety of advanced transmission line temperature detection methods have emerged. Yun-Qi Hao et al. [6] proposed a transmission line vibration monitoring method based on distributed fiber optic sensing technology, which enables real-time monitoring of the transmission line’s vibration state by measuring the Brillouin frequency shift in the optical fiber. This method enhances the sensing capability of the line’s operational state, playing a crucial role in ensuring the safe operation of the power grid. Kai Chen et al. [7] introduced a temperature monitoring method for transmission lines based on distributed temperature sensors (DTS), which enables real-time online monitoring and fault localization of 10 kV railroad transmission lines by measuring the temperature distribution along the cables via optical fibers. This technique effectively detects abnormal cable heating phenomena. Rui Zhou et al. [8] proposed a reliable monitoring and prediction method for transmission lines that combines fiber Bragg grating (FBG) technology with long short-term memory (LSTM) networks. This method collects micro-meteorological data from transmission lines using FBG technology and integrates it with an optimized machine learning model to achieve accurate temperature predictions. Batista F.V. et al. [9] developed a Hybrid Optoelectronic Sensor (HOCT) for monitoring both the current and temperature of overhead transmission lines. This sensor integrates fiber optic technology and electronics, transmitting energy and signals through optical fibers, thereby enabling the monitoring of wire arc sag in high-voltage transmission lines. The HOCT is designed to be lightweight and uses optical fibers to provide electrical energy and transmit signals, offering strong electromagnetic compatibility and insulating properties. Valentina Cecchi et al. [10] proposed a nonuniform segmentation modeling approach that improves the accuracy of line modeling by incorporating temperature measurements and constructing a non-uniform segmented model to capture the effects of temperature gradients on line parameters. Table 1 provides a comparative summary of traditional methods, state-of-the-art deep learning models, and the proposed DA-AHA-CNN-LSTM-TPA (DA-AHA-CLT) model, highlighting the strengths and limitations of each approach, as well as the innovative aspects of the proposed method in addressing multivariate time-series forecasting challenges.
The rapid advancement of deep learning techniques has opened new opportunities for time-series forecasting. Convolutional neural networks (CNNs) are particularly effective at capturing short-term patterns in time series by efficiently extracting local features through sliding convolutional kernels. For instance, Shucheng Luo et al. [11] proposed an integrated algorithm combining Stacking with CNN-BiLSTM-Attention and XGBoost for short-term electricity load forecasting. Muhammad Arslan et al. [12] developed a 1D convolutional neural network (1D-CNN)-based Intrusion Detection System for cyber-attack detection in the Industrial Internet of Things (IIoT). This system utilizes the Edge-IIoTset dataset and, through data preprocessing and optimization, designs a lightweight 1D-CNN model with three convolutional and Dropout layers to efficiently classify nine types of cyber-attacks. Alistair Lumazine et al. [13] introduced a hybrid detection method combining a CNN and Isolation Forest for ransomware detection in network traffic. Mudawi N. et al. [14] proposed a 1D-CNN-based gesture recognition system for everyday gesture recognition in healthcare and online education, achieving efficient gesture tracking and classification through video frame preprocessing, background modeling, skeleton mapping, and multi-feature fusion techniques, combined with a particle swarm optimization algorithm. Long short-term memory (LSTM) networks address the long-term dependency issue in traditional recurrent neural networks (RNNs) through a gating mechanism. For example, Kaleem Ullah et al. [15] proposed a hybrid model combining CNNs and LSTMs for short-term load forecasting (STLF). This model uses CNNs to extract spatial features from high-dimensional data and LSTMs to capture time-series characteristics, effectively improving load prediction accuracy. Wang X. et al. [16] proposed an intelligent cache management strategy based on the CNN-LSTM model for adaptive cache management in complex storage systems. This model integrates the spatial feature extraction capabilities of CNNs and the time-series modeling capabilities of LSTMs to accurately predict cache demand and optimize dynamic cache allocation. Shi J. et al. [17] introduced a residual useful life (RUL) prediction method based on exponential smoothing and a dual-attention LSTM lightweight model (DA-LSTM). This method captures sequence degradation features (SDF) using LSTM, integrates a two-layer attention mechanism to aggregate features, and applies exponential smoothing to reduce noise in sensor signals. Limouni T. et al. [18] proposed a hybrid model combining LSTM and a Temporal Convolutional Network (TCN) for very-short-term PV power prediction, where LSTM extracts time-series features, and TCN establishes the connection between the features and the predicted output. However, a single deep learning model remains limited in its ability to capture both global features and key time points in multivariate time series. Moreover, the performance of deep learning models is highly sensitive to hyperparameter configurations, which often requires significant experience and trial-and-error, leading to inefficient optimization. Therefore, combining deep learning with intelligent optimization techniques presents a promising new approach for transmission line temperature prediction.
To address the aforementioned challenges, this paper proposes a deep learning model architecture, DA-AHA-CLT, based on the Dynamic Adaptive Artificial Hummingbird Algorithm (DA-AHA). The model innovatively integrates the strengths of convolutional neural networks (CNNs), long short-term memory (LSTM) networks, and temporal pattern attention (TPA) while optimizing the model’s hyperparameters using the DA-AHA to achieve high-precision transmission line temperature prediction. Specifically, the CNN module extracts short-term pattern features, the LSTM module captures long-term dependencies within the time series, and the TPA module dynamically weights the features of key time segments to enhance global modeling capabilities. Meanwhile, the DA-AHA optimizes the model’s parameter configurations by dynamically adjusting the step size and balancing global exploration with local exploitation, significantly improving both training efficiency and prediction performance.
The model architecture proposed in this paper not only effectively addresses the limitations of traditional prediction methods in handling complex time-series problems, but also comprehensively captures the dynamic change patterns of transmission line temperatures. The experimental results demonstrate that the DA-AHA-CLT model significantly outperforms existing traditional deep learning models (e.g., LSTM, CNN-LSTM) and prediction models combined with other optimization algorithms (e.g., WOA-CLT, NGO-CLT) across several evaluation metrics (e.g., R2, RMSE, MAE, MedAE). This research offers innovative solutions for the efficient prediction of complex time-series data and opens new research avenues for integrating intelligent optimization algorithms with deep learning models. Additionally, the paper highlights the following main contributions.
(1) In this paper, we propose an innovative enhancement to the traditional artificial hummingbird algorithm (AHA), introducing the Dynamic Adaptive Artificial Hummingbird Algorithm (DA-AHA). By incorporating dynamic step size and inertia weight adjustments, a greedy local search mechanism, an elite retention strategy, and a grouped parallel search mechanism, we significantly enhance both the global exploration and local exploitation capabilities of the algorithm. These improvements effectively address the issue of traditional optimization algorithms being prone to local optima in high-dimensional complex problems, thereby accelerating convergence and improving optimization accuracy.
(2) For the first time, the DA-AHA is integrated with a deep learning model to construct the DA-AHA-CNN-LSTM-TPA (DA-AHA-CLT) model, which combines convolutional neural networks (CNNs), long short-term memory (LSTM) networks, and the temporal pattern attention (TPA) mechanism. With global hyperparameter optimization enabled by the DA-AHA, the model demonstrates exceptional performance in the task of full-time-step transmission line temperature prediction, significantly enhancing both prediction accuracy and stability.
(3) The DA-AHA-CLT model proposed in this paper effectively captures the short-term fluctuations, long-term trends, and key time-period characteristics of transmission line temperatures. It excels in full-time-step prediction, significantly improving fitting ability and prediction accuracy compared to models that combine traditional methods with other optimization algorithms.

2. Modules and Algorithms

2.1. Implementation of the LSTM Method in the Proposed Method

Long short-term memory (LSTM) networks, a specialized form of recurrent neural networks (RNNs), were introduced by Hochreiter and Schmidhuber in 1997 [19] and later enhanced to create a more efficient architecture. LSTM networks are specifically designed to address the issue of long-term dependencies that traditional RNNs struggle with [20]. In time-series modeling, as the time step increases, the initial input information in an RNN is often overshadowed by subsequent data, which diminishes the model’s ability to learn long-term features. LSTM overcomes this challenge by dynamically storing and selectively forgetting information through the use of gating mechanisms, enabling the model to capture both short-term and long-term dependencies simultaneously [21]. The core structure of an LSTM consists of multiple memory cells, each controlling the flow and storage of information via gating mechanisms, such as the forget gate, input gate, and output gate, effectively addressing the long-term dependency problem [22].
Forget Gate: The forget gate determines which past information needs to be discarded. By processing the previous hidden state $h_{t-1}$ and the current input $x_t$, it generates a value $f_t$ between 0 and 1 that controls how much information is forgotten in the memory cell:

$$f_t = \sigma(W_f [h_{t-1}, x_t] + b_f)$$

where $f_t$ is the forget gate activation, $W_f$ is the weight matrix, and $b_f$ is the bias vector for the forget gate; $h_{t-1}$ represents the hidden state from the previous time step, $x_t$ is the input vector at the current time step, and $\sigma$ is the sigmoid activation function.
Input Gate: The input gate selects which new information should be added to the memory and generates the candidate memory values:

$$i_t = \sigma(W_i [h_{t-1}, x_t] + b_i)$$

$$\tilde{C}_t = \tanh(W_C [h_{t-1}, x_t] + b_C)$$

where $i_t$ is the input gate activation, $W_i$ is the weight matrix, and $b_i$ is the bias vector for the input gate; $\tilde{C}_t$ is the candidate cell state, with weight matrix $W_C$ and bias vector $b_C$.
Cell State Update: The forget gate and input gate work together to update the state of the memory cell:

$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t$$

Here, $f_t$ determines how much old information is forgotten and $i_t$ determines how much new information is written.
Output Gate: The output gate controls the output information of the current time step and combines it with the state of the memory cell to generate the hidden state $h_t$:

$$o_t = \sigma(W_o [h_{t-1}, x_t] + b_o)$$

$$h_t = o_t \odot \tanh(C_t)$$

where $o_t$ is the output gate activation, $W_o$ is the weight matrix, and $b_o$ is the bias vector for the output gate.
Through the above mechanism, the LSTM unit is able to dynamically control the circulation and storage of information, avoiding the loss of information with an increase in time steps seen in traditional RNNs, as shown in Figure 1.
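To make the gating computations concrete, the following minimal NumPy sketch implements a single LSTM cell step from the equations above; the stacked weight layout, the function name, and the toy dimensions are illustrative assumptions, not the implementation used in this paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_cell_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step following the gating equations above.

    W maps the concatenated [h_{t-1}, x_t] onto the stacked
    forget/input/candidate/output pre-activations; b is the stacked bias.
    """
    hidden = h_prev.shape[0]
    z = W @ np.concatenate([h_prev, x_t]) + b       # stacked pre-activations
    f_t = sigmoid(z[:hidden])                       # forget gate f_t
    i_t = sigmoid(z[hidden:2 * hidden])             # input gate i_t
    c_tilde = np.tanh(z[2 * hidden:3 * hidden])     # candidate cell state
    o_t = sigmoid(z[3 * hidden:])                   # output gate o_t
    c_t = f_t * c_prev + i_t * c_tilde              # cell state update
    h_t = o_t * np.tanh(c_t)                        # hidden state output
    return h_t, c_t

# Toy usage: hidden size 4, input size 3
rng = np.random.default_rng(0)
hid, inp = 4, 3
W = rng.normal(scale=0.1, size=(4 * hid, hid + inp))
b = np.zeros(4 * hid)
h_t, c_t = lstm_cell_step(rng.normal(size=inp), np.zeros(hid), np.zeros(hid), W, b)
```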

2.2. Temporal Pattern Attention Mechanisms

The attention mechanism is a crucial technique that has been widely adopted in deep learning in recent years. Its primary purpose is to enable neural network models to dynamically focus on relevant parts of both historical and current input information in sequence prediction tasks, thereby enhancing prediction accuracy and efficiency. In time-series tasks, the temporal pattern attention (TPA) mechanism is a specialized attention method for multivariate time-series data, proposed by Shun-Yao Shih et al. [23] in 2019. TPA effectively captures patterns in time-series data by combining a convolutional neural network with an attention scoring function, which assigns different weights to key time points, thus improving the model’s ability to predict complex time-series data, as shown in Figure 2.
The core workflow of TPA can be categorized into the following steps:
Temporal pattern extraction: A CNN filter is used to extract fixed-length pattern segments of time-series data [24]. Suppose the input time series is $X = [x_1, x_2, \ldots, x_T]$; after the convolution operation, a feature sequence $F = [f_1, f_2, \ldots, f_T]$ is generated. These feature segments represent localized patterns in the time series.
Attention weight calculation: For each time segment $f_t$, a weight (importance) is calculated using the attention scoring function:

$$e_t = \tanh(W f_t + b)$$

The scores are then normalized into weights by the softmax function:

$$\alpha_t = \frac{\exp(e_t)}{\sum_{i=1}^{T} \exp(e_i)}$$

where $\alpha_t$ denotes the weight of the t-th time step.
Weighted aggregate output: Each time segment is weighted and summed according to its weight to generate the final global feature representation:

$$H_{\mathrm{attention}} = \sum_{t=1}^{T} \alpha_t f_t$$

This global feature representation strengthens the features of highly weighted key time points and attenuates those of low-weighted time points.
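The following short sketch illustrates this workflow under the simplifying assumption that each segment receives a scalar score (the scoring function above produces a vector per segment); the function name and array shapes are illustrative.

```python
import numpy as np

def tpa_attention(F, w, b=0.0):
    """Simplified TPA sketch: score each feature segment, softmax-normalize,
    and return the weighted aggregate H_attention.

    F : (T, d) array of CNN feature segments f_1..f_T
    w : (d,) scoring weights (a scalar-score simplification of W)
    """
    e = np.tanh(F @ w + b)                          # attention scores e_t
    alpha = np.exp(e) / np.exp(e).sum()             # softmax weights alpha_t
    H_attention = (alpha[:, None] * F).sum(axis=0)  # weighted aggregate
    return H_attention, alpha

# Toy usage: T = 6 time segments with d = 8 features each
rng = np.random.default_rng(1)
F = rng.normal(size=(6, 8))
H, alpha = tpa_attention(F, rng.normal(size=8))
print(alpha.round(3), H.shape)
```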

2.3. CNN

Convolutional neural networks (CNNs) are a widely used deep learning architecture for processing spatially structured data, such as images and time-series data. A typical CNN consists of a convolutional layer, a pooling layer, and a fully connected layer [25]. The core design of CNNs leverages convolutional operations and weight-sharing mechanisms to efficiently extract local features from the data while preserving spatial invariance, thereby significantly enhancing both processing efficiency and accuracy.
In time-series forecasting tasks, one-dimensional convolutional neural networks (1D CNNs) capture local patterns and short-term trends by applying sliding convolutional kernels, which, combined with pooling operations, help reduce data complexity and mitigate the risk of overfitting [26]. Specifically, the convolutional layer identifies key change patterns in the time series by extracting features within a defined time window, while the pooling layer retains important features and reduces redundant information through dimensionality reduction operations (e.g., maximum pooling or average pooling). Additionally, the features processed by the convolutional and pooling layers are passed to the fully connected layer for modeling and final prediction of the overall trend.
In this study, a 1D convolutional neural network (CNN) is employed to analyze and forecast time-series data. With a specially designed convolutional kernel, the network effectively captures short-term features in the time series. The parameter-sharing mechanism of the convolutional kernel significantly reduces model complexity and enhances computational efficiency. This approach not only enables the network to quickly adapt to large-scale data but also improves its sensitivity to rapidly changing patterns in the time series. Ultimately, the structure of the 1D CNN makes it highly flexible and scalable for time-series prediction, offering an efficient solution for handling complex dynamic patterns, as shown in Figure 3.
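As a rough illustration of this design, a 1D convolutional feature extractor might look like the following PyTorch sketch; the channel counts, kernel size, and window length are illustrative choices rather than the paper's tuned values.

```python
import torch
import torch.nn as nn

# Minimal 1D-CNN feature extractor for multivariate time-series windows
cnn = nn.Sequential(
    nn.Conv1d(in_channels=6, out_channels=32, kernel_size=3, padding=1),  # local patterns
    nn.ReLU(),
    nn.MaxPool1d(kernel_size=2),  # dimensionality reduction via max pooling
)

x = torch.randn(16, 6, 48)   # 16 windows, 6 input features, 48 time steps
features = cnn(x)            # -> shape (16, 32, 24)
print(features.shape)
```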

2.4. Dynamic Adaptive Artificial Hummingbird Algorithm (DA-AHA)

The artificial hummingbird algorithm (AHA) is an emerging population-based optimization algorithm inspired by the foraging behavior of hummingbirds in nature [27]. Hummingbirds employ unique foraging strategies, including migratory foraging, guided foraging, and territorial foraging, which enable them to efficiently search for optimal food sources in complex environments. By simulating the flexible movement strategies and foraging patterns of hummingbirds, the AHA aims to address global optimization problems, particularly those that are high-dimensional, multi-peak, and nonlinear. However, the original AHA still has limitations in search efficiency and local search capability, particularly in its tendency to fall into local optima in later stages and its need for improved search accuracy.
Description of improvements
  • Dynamic step size and inertia weight adjustment: By dynamically updating the step size and inertia weight, the hummingbird has a larger search range in the early stage of global search, and gradually converges to local search in the later stage, which balances the exploration and exploitation capabilities in the search process.
  • Greedy local search mechanism: After executing the foraging strategy, a local search mechanism is introduced for the current solution, which selects a better solution by fine-tuning the comparison between the current position and the neighboring positions, thus effectively improving the convergence accuracy and local optimization ability of the algorithm.
  • Elite retention mechanism: In each iteration, the current global optimal solution is saved and passed to the next generation to prevent the degradation of the solution, ensure the stability of the optimal solution, and further improve the global search effect of the algorithm.
  • Grouped parallel search mechanism: The hummingbird population is divided into multiple groups. Each group searches independently in a local area, and the individuals in each group converge toward that group's best solution. This mechanism expands the search space and improves both the diversity of solutions and the overall efficiency of the algorithm.
Through the above improvements, the DA-AHA better balances global and local search, significantly improving the search efficiency, optimization accuracy, and stability of the algorithm, so that it exhibits stronger robustness and convergence performance when solving high-dimensional, multi-peak, complex optimization problems.

2.4.1. Initialization Phase

The initialization phase of the Dynamic Adaptive Artificial Hummingbird Algorithm (DA-AHA) serves as the foundation for its optimization process. In this phase, the positions of N hummingbirds are randomly initialized within the predefined search space, representing potential solutions to the optimization problem. Each hummingbird's position, denoted as $x_{i,j}$, is determined using a uniform distribution within the boundaries of the search space, expressed as

$$x_{i,j}(0) = x_{\min,j} + \mathrm{rand}(0,1) \cdot (x_{\max,j} - x_{\min,j})$$

where $x_{\min,j}$ and $x_{\max,j}$ are the lower and upper bounds of the j-th dimension, respectively, and $\mathrm{rand}(0,1)$ is a random number between 0 and 1.
Once initialized, the fitness values of all hummingbirds are calculated based on the defined objective function, which evaluates the quality of each potential solution. Additionally, key parameters, including the step range (b), inertia weights (w), and maximum number of iterations (MaxIter), are defined to guide the algorithm’s progression. This phase ensures that the algorithm starts with a diverse population, which is crucial for robust exploration of the solution space.
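A minimal sketch of this initialization step is shown below; the three hyperparameter dimensions and their bounds are illustrative assumptions.

```python
import numpy as np

def init_population(N, lower, upper, rng):
    """Uniform random initialization of N hummingbird positions."""
    lower, upper = np.asarray(lower, float), np.asarray(upper, float)
    return lower + rng.random((N, lower.size)) * (upper - lower)

rng = np.random.default_rng(42)
# Illustrative hyperparameter dimensions: hidden units, learning rate, kernel size
population = init_population(N=20, lower=[16, 1e-4, 2], upper=[256, 1e-1, 8], rng=rng)
print(population.shape)  # (20, 3)
```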

2.4.2. Dynamic Step Size and Inertia Weight Adjustment

To enhance the balance between global exploration and local exploitation, the DA-AHA introduces dynamic step size and inertia weight adjustment mechanisms. These parameters, which adapt based on the current iteration, play a pivotal role in controlling the movement of hummingbirds within the search space.
The step size, denoted as b(t), starts with a relatively large value to facilitate broad global exploration in the early stages of optimization. Over time, b(t) gradually decreases, allowing the hummingbirds to focus on local exploitation during the later iterations. This adjustment is mathematically defined as
$$b(t) = b_{\max} - (b_{\max} - b_{\min}) \cdot \frac{t}{\mathrm{MaxIter}}$$

where $b_{\max}$ and $b_{\min}$ are the initial and final step sizes, $t$ is the current iteration, and MaxIter is the maximum number of iterations.
Similarly, the inertia weight w(t) adjusts dynamically to balance exploration and exploitation, decreasing as the algorithm progresses:
$$w(t) = w_{\max} - (w_{\max} - w_{\min}) \cdot \frac{t}{\mathrm{MaxIter}}$$
This mechanism ensures that the hummingbirds have greater freedom to explore diverse regions of the solution space initially, while gradually honing in on promising solutions with increased precision.
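The two linear decay schedules translate directly into code; the default bound values below (b_max, b_min, w_max, w_min) are illustrative assumptions, since the paper leaves the concrete settings to its parameter tables.

```python
def step_size(t, max_iter, b_max=1.0, b_min=0.1):
    """b(t): large early steps for exploration, small late steps for exploitation."""
    return b_max - (b_max - b_min) * t / max_iter

def inertia_weight(t, max_iter, w_max=0.9, w_min=0.4):
    """w(t): analogous linear decay of the inertia weight."""
    return w_max - (w_max - w_min) * t / max_iter

# At t = 0 the step size equals b_max; at t = max_iter it has decayed to b_min.
print(step_size(0, 100), step_size(100, 100))  # 1.0 0.1
```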

2.4.3. Foraging Strategy Selection

In the Dynamic Adaptive Artificial Hummingbird Algorithm (DA-AHA), the selection of foraging strategies plays a critical role in balancing global exploration and local exploitation. Hummingbirds dynamically choose between two distinct strategies, guided foraging and territorial foraging, based on the parameters $b(t)$ and $w(t)$, which adapt over the course of iterations.
In guided foraging, the hummingbird directs its movement toward the current global optimal position, utilizing information from the best solution found so far. This approach allows the hummingbird to refine its search in promising regions of the solution space. The movement is mathematically expressed as
$$x_{i,j}(t+1) = x_{i,j}(t) + \beta \cdot (g_{\mathrm{best},j}(t) - x_{i,j}(t))$$

where $g_{\mathrm{best},j}(t)$ represents the globally optimal solution at the current iteration, and $\beta$ is a perturbation factor that introduces controlled variability to prevent premature convergence. This strategy is particularly effective in exploiting high-quality solutions by honing in on their vicinity.
On the other hand, territorial foraging focuses on exploring the local area around the hummingbird’s current position. By adding random perturbations to the position, the hummingbird investigates nearby regions to identify potential improvements. This behavior is modeled as
$$x_{i,j}(t+1) = x_{i,j}(t) + b \cdot \mathrm{rand}(-1,1)$$

where $b$ represents the local search step size, and the random function $\mathrm{rand}(-1,1)$ introduces stochasticity. Territorial foraging enhances the algorithm's ability to avoid becoming trapped in local optima by thoroughly examining the immediate neighborhood of the current solution.
By dynamically alternating between these two strategies, the DA-AHA ensures a balanced approach to optimization, effectively combining global search to locate promising regions and local search to refine solutions with precision.
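A compact sketch of the two update rules follows; the values of β, b, and the toy positions are chosen purely for illustration.

```python
import numpy as np

def guided_foraging(x, g_best, beta):
    """Move toward the current global best with perturbation factor beta."""
    return x + beta * (g_best - x)

def territorial_foraging(x, b, rng):
    """Perturb the current position within a local neighborhood of step size b."""
    return x + b * rng.uniform(-1.0, 1.0, size=x.shape)

rng = np.random.default_rng(7)
x, g_best = np.array([0.2, 0.8]), np.array([0.5, 0.5])
print(guided_foraging(x, g_best, beta=0.6))
print(territorial_foraging(x, b=0.1, rng=rng))
```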

2.4.4. Greedy Local Search Mechanism

The greedy local search mechanism is a refinement step that enhances the accuracy and efficiency of the DA-AHA. After executing the primary foraging strategy, each hummingbird performs a localized fine-tuning search around its current position. This process involves generating a neighboring candidate solution, denoted as $x'_{i,j}$, by introducing small perturbations:

$$x'_{i,j} = x_{i,j} + \mathrm{rand}(-1,1) \cdot \lambda$$

where $\lambda$ is a scaling factor that controls the magnitude of the local perturbation.

The fitness of the new candidate solution is then compared with that of the current position. If the new position offers an improved fitness value, it is retained as the updated solution:

$$x_{i,j} = \begin{cases} x'_{i,j}, & \text{if } f(x'_{i,j}) < f(x_{i,j}), \\ x_{i,j}, & \text{otherwise.} \end{cases}$$
This greedy approach ensures that each hummingbird iteratively converges toward better solutions, thereby improving the overall performance of the algorithm.
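The accept-if-better rule is a one-liner in code; the sphere fitness function and the value of λ below are illustrative.

```python
import numpy as np

def greedy_local_search(x, fitness, lam, rng):
    """Accept the perturbed neighbor only if it improves (lowers) the fitness."""
    candidate = x + rng.uniform(-1.0, 1.0, size=x.shape) * lam
    return candidate if fitness(candidate) < fitness(x) else x

# Toy usage on a sphere fitness function
rng = np.random.default_rng(3)
sphere = lambda v: float(np.sum(v ** 2))
x = np.array([0.5, -0.3])
x = greedy_local_search(x, sphere, lam=0.05, rng=rng)
```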

2.4.5. Elite Retention Mechanisms

The elite retention mechanism is designed to preserve the best solutions identified during the optimization process. At each iteration, the global best solution $g_{\mathrm{best},j}$ is retained and carried forward to the next generation. This ensures that the optimization process does not lose high-quality solutions due to stochastic perturbations or local search steps.
By safeguarding the global optimum, the algorithm maintains its stability and robustness, particularly in later iterations when local exploitation becomes dominant. This mechanism acts as a safeguard, preventing solution degradation and ensuring consistent convergence toward the global optimum.

2.4.6. Grouped Parallel Search Mechanisms

To further enhance the algorithm’s efficiency and diversity, the DA-AHA employs a grouped parallel search mechanism. In this approach, the population of hummingbirds is divided into multiple groups, each searching independently within localized regions of the solution space.
Within each group, hummingbirds collaborate to identify the best local solution, denoted as $g_{\mathrm{local}}$. The individuals in the group adjust their positions based on $g_{\mathrm{local}}$, ensuring convergence within the subgroup:

$$x_{i,j} = x_{i,j} + \alpha \cdot (g_{\mathrm{local},j} - x_{i,j})$$

where $\alpha$ is a convergence factor.
The grouped parallel search mechanism not only improves the exploration of diverse regions, but also mitigates the risk of premature convergence by maintaining population diversity. This collaborative yet independent search strategy expands the algorithm’s ability to explore and exploit the solution space efficiently.
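A sketch of one grouped search step is given below, assuming a random partition of the population into equally sized groups (the paper does not specify the grouping rule).

```python
import numpy as np

def grouped_search_step(pop, fitness, n_groups, alpha, rng):
    """Split the population into groups; each group converges toward its own g_local."""
    indices = np.array_split(rng.permutation(len(pop)), n_groups)
    for idx in indices:
        fits = [fitness(pop[i]) for i in idx]
        g_local = pop[idx[int(np.argmin(fits))]].copy()   # best solution in the group
        pop[idx] = pop[idx] + alpha * (g_local - pop[idx])
    return pop

rng = np.random.default_rng(9)
pop = rng.random((12, 2))
pop = grouped_search_step(pop, lambda v: float(np.sum(v ** 2)),
                          n_groups=3, alpha=0.5, rng=rng)
```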

2.4.7. Termination Conditions

The termination of the Dynamic Adaptive Artificial Hummingbird Algorithm (DA-AHA) is governed by predefined conditions that ensure the optimization process concludes when an optimal or satisfactory solution is reached. These conditions are essential for balancing computational efficiency and solution quality.
The algorithm terminates under one of two circumstances. The first condition is reaching the maximum number of iterations, MaxIter. This ensures that the algorithm has a fixed computational budget, preventing excessive runtime and resource usage. By capping the number of iterations, the optimization process remains computationally tractable, particularly for high-dimensional or complex problems.
The second condition is based on the convergence of the global best solution, $g_{\mathrm{best}}$. Specifically, if the change in fitness value between successive iterations falls below a predefined threshold $\epsilon$, the algorithm assumes that it has reached an optimal or near-optimal solution and halts. This criterion ensures that the algorithm does not continue searching once meaningful improvements are no longer achievable:

$$|f(g_{\mathrm{best}}(t)) - f(g_{\mathrm{best}}(t-1))| < \epsilon$$

where $f(g_{\mathrm{best}}(t))$ and $f(g_{\mathrm{best}}(t-1))$ are the fitness values of the global best solution at iterations $t$ and $t-1$, respectively.
By combining these two termination conditions, the DA-AHA achieves a balance between exploration and computational efficiency. The maximum iteration limit prevents unnecessary computations, while the convergence criterion ensures that the algorithm halts only when further improvements are negligible. Together, these termination conditions guarantee that the optimization process is both efficient and effective.
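Both stopping rules combine into a single check; the default tolerance below is an illustrative value for ϵ.

```python
def should_terminate(t, max_iter, f_best_curr, f_best_prev, eps=1e-6):
    """Stop at the iteration cap or once the best-fitness change drops below eps."""
    return t >= max_iter or abs(f_best_curr - f_best_prev) < eps

print(should_terminate(40, 100, 0.00700, 0.00700005))  # True: converged early
```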
As shown in Figure 4, in the DA-AHA optimization section on the left side of the figure, the algorithm first establishes the initial search space by initializing the individual hummingbirds, the locations of the food sources, and the fitness values. Subsequently, dynamic step size and inertia weights are introduced and adjusted according to the current iteration stage, giving the hummingbird a larger step size in the early stage to strengthen the global search and gradually decreasing the step size in the later stage to enhance the local exploitation capability. The algorithm triggers the corresponding foraging strategies based on different conditions: if the migration condition is satisfied, global exploration is performed and the food source is regenerated; if not, local exploitation is executed through guided foraging (moving toward the global optimal solution) and territorial foraging (local search around the current region). After the foraging strategy is executed, a greedy local search mechanism is further introduced to fine-tune the current location and select a better solution through neighborhood search. During each iteration, the elite retention mechanism preserves the current global optimal solution to ensure that the solution does not degrade; at the same time, the grouped parallel search mechanism divides the hummingbirds into multiple groups, each of which independently performs a local search, further improving search efficiency and solution diversity. The whole iteration process continues until the maximum number of iterations is reached or the solution converges, and the global optimal solution is finally output.
In the deep learning training section on the right side of the figure, the dataset is first processed and divided into a training set and a test set. Subsequently, a deep learning model is constructed and trained on the training data. After training, the model generates predictions on the test data, and the accuracy and error of the prediction results are calculated to evaluate the model's performance. The DA-AHA plays a key role in this process by optimizing the selection of hyperparameters and the initial conditions of the model, providing a better starting point and parameter combination for deep learning training. With the advantages of the DA-AHA in global search and local exploitation, the deep learning model improves training efficiency in complex optimization scenarios and significantly improves prediction accuracy and stability.

2.5. Whale Optimization Algorithm

The Whale Optimization Algorithm (WOA) is a population-based optimization algorithm inspired by the humpback whale's bubble-net feeding behavior [28]. It solves complex optimization problems by simulating the encircling and spiral motions around prey, as well as the global search behavior during the whales' hunting process. In this algorithm, the optimal individual in the population is treated as the prey, and the remaining individuals adjust their positions to approach the optimal solution. They randomly select either the encircling or spiral movement pattern with a certain probability to perform a local search, while simultaneously utilizing the global search mechanism to avoid becoming trapped in local optima. The WOA strikes a balance between global exploration and local exploitation through dynamic parameter adjustment, offering advantages such as simplicity, ease of implementation, strong adaptability, and fast convergence. It has been widely applied in complex function optimization, engineering design, data mining, path planning, and machine learning, making it an efficient and robust intelligent optimization method.

2.6. Northern Goshawk Optimization

The Northern Goshawk Optimization (NGO) algorithm is a population-based optimization technique inspired by the ecological characteristics of northern goshawk hunting behavior [29]. Northern goshawks are renowned for their efficient and precise hunting skills, demonstrating strong searching abilities and collaborative traits as they find prey in complex environments using diverse hunting strategies. The algorithm simulates two core behaviors of the hawk during hunting: first, locating the prey through extensive searching (global exploration), and second, approaching the prey using high-speed flight combined with a precise pouncing strategy (local exploitation). During the optimization process, individuals dynamically adjust their positions in the search space to converge toward the optimal solution through collaboration. NGO is designed to have strong global exploration and local exploitation capabilities, and by iterating in the search space, the algorithm converges to the globally optimal solution while avoiding local optima. With its flexibility and efficiency, the algorithm performs excellently in function optimization, engineering design, path planning, and other fields. It is particularly suitable for high-dimensional, nonlinear, and multimodal optimization problems.

2.7. Particle Swarm Optimization

Particle swarm optimization (PSO) is a global optimization algorithm based on swarm intelligence [30], inspired by the foraging behaviors of a flock of birds and the motion of a school of fish. PSO finds the optimal solution by representing each candidate solution as a particle and updating the position and velocity of each particle to simulate its movement in the search space. Each particle is guided by its own historical best position (pBest) and the global best position (gBest), while dynamically adjusting the inertia weights to balance global search and local exploitation. The PSO algorithm is known for its simplicity, ease of implementation, and high computational efficiency, making it widely used in function optimization, neural network training, path planning, and parameter tuning. However, its tendency to fall into local optima can be mitigated by introducing dynamic weights, chaotic search strategies, or combining it with other optimization algorithms to further enhance its performance.

2.8. Transmission Line Temperature Prediction Model

First, the target parameters for optimization, including the hyperparameters of the model and the network structure parameters, need to be defined to ensure that the experimental results are optimal. Next, the fitness function is defined for evaluating the performance of the model under specific parameter configurations. A set of parameter configurations is initialized by generating random combinations within a preset range, and the model is trained and its performance analyzed against these parameter configurations to compute the fitness value for each potential solution. This process aims to systematically explore the parameter space to find the optimal parameter configurations that maximize model performance.
This paper proposes a deep learning model architecture that combines convolutional neural networks (CNNs), long short-term memory (LSTM) networks, the temporal pattern attention (TPA) mechanism, and the Dynamic Adaptive Hummingbird Algorithm (DA-AHA) to address the task of predicting transmission line temperatures. The model effectively integrates the feature extraction and time-series modeling capabilities of deep learning with the parameter tuning strengths of intelligent optimization algorithms. It is capable of capturing short-term patterns, long-term dependencies, and key time-period features in the time series, thereby enabling efficient and accurate prediction of transmission line temperatures.
First, the input data consist of one-dimensional time-series data recorded during transmission line operation, including features such as current, voltage, ambient temperature, wind speed, humidity, and other factors related to line temperature. These data are fed into the CNN module for initial processing, where the CNN extracts local features using a sliding convolutional kernel to capture patterns and trends within short time windows, such as the relationship between temperature and wind speed over a specific period. The parameter-sharing mechanism of the convolutional kernel reduces the model’s computational complexity while enhancing its generalization ability and robustness in feature extraction. After further dimensionality reduction through pooling operations, the CNN layer outputs a set of concise and efficient local features, providing high-quality input for subsequent time-series modeling.
Subsequently, the features extracted by the CNN are passed to the LSTM module. The LSTM models the long-term dependencies of the time series by utilizing its unique gating mechanisms including forget gates, input gates, and output gates. Specifically, the forget gate is responsible for filtering and retaining important historical information, the input gate integrates new local features into the memory cell, and the output gate generates a hidden state output for the current time step based on the state of the memory cell. Through this series of operations, the LSTM layer is able to capture dynamic correlations in transmission line temperature changes, such as the cumulative response of line temperature to environmental variables over multiple time steps. The hidden and memorized states of the LSTM outputs further enhance the ability to express global features of the sequence data.
Building on this, the output features from the LSTM are passed to the TPA module, which dynamically assigns weights to key time segments in the time-series data through its attention mechanism. The TPA first uses a fixed-length convolutional kernel to extract patterns from the input sequences, identifying features within fixed-length time windows. It then calculates the importance of each time segment using attention weights and dynamically adjusts its contribution to the final feature representation, thereby emphasizing the most critical time segments for temperature prediction. The TPA module significantly enhances the model’s flexibility and accuracy in global modeling.
Finally, the model’s parameter optimization is performed using the Dynamic Adaptive Hummingbird Algorithm (DA-AHA). The DA-AHA simulates the foraging behavior of hummingbirds, achieving a balance between global search and local exploitation through dynamic step size adjustment. It expands the solution space by incorporating grouped parallel searches to optimize the hyperparameters (such as the size of the convolution kernel, learning rate, and number of hidden units) of the CNN, LSTM, and TPA modules, ensuring the model’s optimal performance. The optimized features are then integrated through the fully connected (FC) layer, and the output layer generates the final prediction of the transmission line temperature, as shown in Figure 5.
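To summarize the data flow, the following PyTorch sketch wires the three modules together in the order described above; the layer sizes and the simplified scalar attention scoring are illustrative assumptions, not the DA-AHA-optimized configuration.

```python
import torch
import torch.nn as nn

class CLT(nn.Module):
    """Minimal CNN-LSTM-TPA sketch; all sizes are illustrative choices."""

    def __init__(self, n_features=6, cnn_channels=32, hidden=64):
        super().__init__()
        self.cnn = nn.Sequential(                       # local feature extraction
            nn.Conv1d(n_features, cnn_channels, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        self.lstm = nn.LSTM(cnn_channels, hidden, batch_first=True)  # long-term deps
        self.score = nn.Linear(hidden, 1)               # simplified TPA scoring
        self.head = nn.Linear(hidden, 1)                # temperature output

    def forward(self, x):                               # x: (batch, time, features)
        f = self.cnn(x.transpose(1, 2)).transpose(1, 2) # (batch, time, cnn_channels)
        h, _ = self.lstm(f)                             # (batch, time, hidden)
        alpha = torch.softmax(torch.tanh(self.score(h)), dim=1)  # attention weights
        context = (alpha * h).sum(dim=1)                # weighted aggregate
        return self.head(context).squeeze(-1)           # predicted line temperature

model = CLT()
pred = model(torch.randn(8, 48, 6))  # 8 windows of 48 hourly steps, 6 features
print(pred.shape)                    # torch.Size([8])
```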

3. Experimental Section

In this study, the proposed model, the DA-AHA-CNN-LSTM-TPA (DA-AHA-CLT), is evaluated using a region-specific dataset of transmission line temperatures. The dataset includes transmission line operating parameters from 00:00 on 1 January 2024 to 00:00 on 1 December 2024, containing a total of 8041 records at one-hour intervals. It covers a broad range of parameters related to transmission lines, such as voltage (in kV), ambient temperature (in °C), wire type, tower height (in meters), wind speed (in m/s), wind direction (in °), and transmission line temperature (in °C). These parameters account for both transmission line operating conditions and the external environmental factors that influence temperature changes. As a result, the dataset enhances the model's ability to accurately predict transmission line temperatures under varying weather and load conditions. Detailed information about the time range, data volume, scale, and environmental conditions under which the model was tested is provided for each dataset, as shown in Table 2 and Table 3.

3.1. Data Processing

In the data processing stage, the dataset was cleaned and normalized in order to remove outliers and fill in missing values while ensuring data consistency. The specific processing of the first 200 data points is described below.
First, the raw transmission line temperature data were visualized, revealing some anomalies, which may have originated from noise or measurement errors. During the data cleaning process, all temperature values deviating by more than ±10 °C were identified as outliers and treated as missing values, as shown in Figure 6. To fill in these missing values and maintain data continuity, each was replaced by the average of the neighboring time points. This interpolation method smooths out data fluctuations and avoids abrupt changes, while more accurately reflecting the actual temperature trend.
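A hedged pandas sketch of this cleaning step is shown below; the centered rolling-median criterion used to operationalize the ±10 °C rule, the window length, and the column names are our assumptions.

```python
import pandas as pd

def clean_temperature(series: pd.Series, threshold: float = 10.0) -> pd.Series:
    """Flag readings deviating more than `threshold` degC from a centered rolling
    median as outliers (assumed criterion), then interpolate between neighbors."""
    local = series.rolling(window=5, center=True, min_periods=1).median()
    cleaned = series.mask((series - local).abs() > threshold)  # outliers -> NaN
    return cleaned.interpolate(method="linear", limit_direction="both")

# Usage with a hypothetical hourly DataFrame `df` holding the raw records:
# df["line_temp_clean"] = clean_temperature(df["line_temp"])
```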

3.2. Target Functions and Evaluation Indicators

In this experiment, the model’s performance was evaluated using various metrics, including the coefficient of determination (R2), root-mean-square error (RMSE), mean absolute error (MAE), and median absolute error (MedAE). These metrics provide insights into the model’s predictive accuracy and stability from different perspectives, enabling a comprehensive assessment of its performance. Specifically, R2 measures how well the model fits the data, with a value closer to 1 indicating a better explanation of the data’s variance. RMSE quantifies the magnitude of the difference between predicted and actual values, where a smaller value indicates higher prediction accuracy. MAE calculates the average absolute error between predicted and actual values, with a lower MAE indicating fewer overall errors. MedAE, which reflects the model’s accuracy and stability through the median absolute error, is particularly useful for evaluating robustness since it is less sensitive to outliers. By combining these indicators, the model’s strengths and weaknesses can be comprehensively assessed, providing a solid foundation for subsequent optimization.
$$R^2 = 1 - \frac{\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}{\sum_{i=1}^{n}(y_i - \bar{y})^2}$$

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}|y_i - \hat{y}_i|$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}$$

$$\mathrm{MedAE} = \mathrm{median}(|y_i - \hat{y}_i|)$$
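The four metrics are straightforward to compute with NumPy; the sketch below follows the formulas above, with toy inputs for illustration.

```python
import numpy as np

def evaluate(y_true, y_pred):
    """Compute the four evaluation metrics used in this paper."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    err = y_true - y_pred
    ss_res = np.sum(err ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "RMSE": float(np.sqrt(np.mean(err ** 2))),
        "MAE": float(np.mean(np.abs(err))),
        "MedAE": float(np.median(np.abs(err))),
    }

print(evaluate([20.1, 21.4, 22.8], [20.3, 21.2, 23.0]))
```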
The training and testing process of the transmission line temperature prediction model proposed in this paper is illustrated in Figure 7. The dataset is first split into training and testing sets in an 80:20 ratio, and the data are normalized to ensure consistency in the scale of the features. During the training phase, the Dynamic Adaptive Artificial Hummingbird Algorithm (DA-AHA) generates new candidate solutions by dynamically adjusting the step size and inertia weights, while incorporating a grouped parallel search mechanism to enhance optimization efficiency. The fitness function is then used to evaluate the fitness value of each candidate solution, driving both global search and local exploitation to determine the optimal parameter configuration for the model.
During the optimization process, DA-AHA dynamically updates the search strategy based on the current iteration stage. For example, it employs guided foraging (approaching the global optimal solution) and territorial foraging (local search around the current region) to efficiently optimize the model’s hyperparameters. If the maximum number of iterations is reached or the fitness value meets a specified threshold, the optimization terminates, and the optimal parameter configuration for the CLT model is output.

3.3. Ablation Experiment

In this section, the prediction performance of the DA-AHA-CLT model is evaluated by comparing it with other popular models for transmission line temperature prediction. First, the DA-AHA-CLT model is compared with LSTM, CNN-LSTM, and LSTM-TPA models across the full time span to verify its predictive advantage. The training details for each model are provided in the table. The evaluation includes different time periods as well as peak load conditions to comprehensively capture the cyclical nature of transmission line temperature variations in response to the external environment.
In this assessment, the full time step is used for prediction. The full time step is effective in capturing both diurnal and long-term trends in transmission line temperatures, particularly in accounting for slow-acting factors such as ambient temperature, wind speed, humidity, and voltage loading. By utilizing the full time step, the model not only comprehensively captures the overall trend in temperature changes, but also minimizes the accumulation of errors typically associated with short-term predictions. The experimental results show that the DA-AHA-CLT model significantly outperforms traditional LSTM and CNN-LSTM models in capturing the long-term change patterns and complex dynamic characteristics of transmission line temperature. Specifically, it demonstrates higher prediction accuracy, stability, and reliability, especially in long-term temperature forecasting, as shown in Table 4.
Table 4 presents the RMSE, MAE, R2, and MedAE metrics for the CLT, LSTM, CNN-LSTM, and LSTM-TPA models across the entire time series. Higher R2 values and lower RMSE and MAE values indicate better prediction accuracy, while lower MedAE values reflect the models’ robustness in handling data fluctuations. This is shown in Table 5 and Figure 8.
In this ablation experiment, six models (CLT, LSTM-TPA, CNN-LSTM, LSTM, ARIMA, and SARIMA) were evaluated and compared using the metrics of R2, RMSE, MAE, and MedAE. The experimental results show that the CLT model outperforms the other models across all metrics.
In terms of the goodness-of-fit index R2, the CLT model achieves an R2 value of 0.878, significantly higher than the other models, indicating that it more accurately captures the data trend. In comparison, the LSTM-TPA and CNN-LSTM models have R2 values of 0.798 and 0.773, respectively, which are somewhat less effective in fitting the data. The LSTM model, with an R2 value of only 0.704, shows the poorest performance, highlighting its limited ability to explain the data.
In terms of error metrics, the CLT model exhibits the lowest error values, with a root-mean-square error (RMSE) of 0.92, a mean absolute error (MAE) of 0.87, and a median absolute error (MedAE) of 0.71. This indicates that the CLT model’s predictions have the least deviation from the true values, demonstrating strong performance in both overall error (MAE and RMSE) and typical error (MedAE). The LSTM-TPA model follows closely, but its RMSE, MAE, and MedAE increase to 1.07, 0.94, and 0.81, respectively, indicating a slight increase in prediction error compared to the CLT model. The CNN-LSTM model shows a further widening of errors, with RMSE, MAE, and MedAE reaching 1.13, 0.97, and 0.78, respectively. Although its performance is similar to that of LSTM-TPA, there are still some discrepancies in its predictions. The LSTM model performs the worst, with an RMSE of 1.59, an MAE of 1.32, and a MedAE of 0.91. These significantly higher prediction errors suggest that the model’s performance is substantially degraded after the removal of key modules.
From the overall analysis, the CLT model delivers the best performance in this ablation experiment and is significantly better than the other models in terms of fitting ability and error control. This indicates that the ablated components have a greater impact on the performance of the other models, while the CLT model maintains strong stability and accuracy under the ablation experiment, making it the best-performing model in this experiment.
The DA-AHA is employed in the DA-AHA-CLT model to optimize four key parameters of the CLT: the number of network units, the regularization term, the learning rate, and an unspecified fourth parameter. Prior to the optimization process, the DA-AHA determines the initial parameters. The fitness of the individuals is then updated according to a policy that directs the search toward the global optimum. Figure 9 illustrates the optimization process over 40 iterations, and Table 6 provides the relevant parameters of the DA-AHA. Figure 10 shows the changes in training and testing loss over 100 iterations.
Figure 9 illustrates the convergence process of the fitness value with respect to the number of iterations based on the new Hummingbird algorithm. The fitness value rapidly decreases from 0.013 to approximately 0.009 within the first five iterations, indicating that the algorithm performs a fast global search through dynamic step size adjustment in the initial stage. Subsequently, the fitness value decreases gradually in a stepwise manner between the 6th and 20th iterations, reflecting the algorithm’s combination of guided foraging and local fine-tuning mechanisms to optimize the solution. After 20 iterations, the fitness value stabilizes at around 0.007, suggesting that the algorithm effectively completes local exploitation through elite retention and grouped parallel search mechanisms, ultimately achieving fast convergence. Overall, the DA-AHA achieves a good balance between global exploration and local development, demonstrating strong optimization efficiency and robustness.
Figure 10 shows how the loss of the model optimized with the new Hummingbird algorithm changes during the training and testing phases. As can be seen from the figure, the loss value decreases rapidly in the early stage of training, from about 0.7 to close to 0.02 within 20 epochs, and then stabilizes and stays at a low level (less than 0.02). The trends of training loss and testing loss are almost identical, indicating that the model has good generalization ability and consistency. This suggests that the DA-AHA not only optimizes the hyperparameters of the model through dynamic step-size adjustment with efficient global search and local development mechanisms, but also significantly accelerates the convergence process of the model, and improves the prediction performance and robustness at the same time.
Next, this paper compares the DA-AHA-CLT prediction model with models that combine other optimization algorithms, including the Whale Optimization Algorithm (WOA), the Northern Goshawk Optimization (NGO) algorithm, and the particle swarm optimization (PSO) algorithm, as well as with the traditional CLT model. These comparisons aim to further validate the effectiveness and superiority of the DA-AHA-CLT model in transmission line temperature prediction. This is shown in Table 7 and Figure 11.
For full-time-step prediction, the DA-AHA-CNN-LSTM-TPA (DA-AHA-CLT) model demonstrates excellent performance, achieving an R2 score of 0.987. Compared to the traditional models, the DA-AHA-CLT model outperforms the CLT by 0.109, the LSTM by 0.283, and the CNN-LSTM by 0.214 in terms of R2 score. These results highlight the significant deficiencies in the fitting ability of traditional models, while the DA-AHA substantially enhances the model’s capacity to fit full-time-step sequences through the optimization of the structure and parameters of the CNN-LSTM-TPA architecture.
Compared with current state-of-the-art optimization models, the R2 score of the DA-AHA-CLT model is 0.023 higher than that of the WOA-CLT model and 0.016 higher than that of the NGO-CLT model. In addition, the DA-AHA-CLT model exhibits the lowest error values, with an RMSE (root-mean-square error) of 0.023, an MAE (mean absolute error) of 0.018, and a MedAE (median absolute error) of 0.011, indicating that it not only leads in goodness of fit but also significantly outperforms the other optimization models in the robustness and accuracy of its predictions. Since the full-time-step prediction results contain a large number of data points, presenting them directly may not fully convey the model's prediction performance; therefore, 150 data points were selected for visualization, as shown in Figure 12 and Figure 13.

4. Conclusions

In this paper, we propose a CNN-LSTM-TPA model (DA-AHA-CLT) based on the Dynamic Adaptive Hummingbird Algorithm (DA-AHA) to address the challenge of the full-time-step prediction of transmission line temperature and evaluate its performance in detail. The DA-AHA-CLT model achieves remarkable results in the full-time-step prediction task, with an R2 score of 0.987, significantly outperforming conventional models such as CLT, LSTM, and CNN-LSTM by 0.109, 0.283, and 0.214, respectively. This indicates that conventional models struggle to capture the global features of the time series, while the DA-AHA-CLT model enhances fitting accuracy by optimizing the structure and parameters of the CNN-LSTM-TPA framework. Compared to current state-of-the-art optimization models, the R2 scores of the DA-AHA-CLT model are 0.023 and 0.016 higher than those of WOA-CLT and NGO-CLT, respectively. Moreover, the DA-AHA-CLT model exhibits the lowest values in RMSE, MAE, and MedAE error metrics, with values of 0.023, 0.018, and 0.011, respectively. These results indicate that the DA-AHA not only significantly improves the prediction accuracy, but also enhances the model’s adaptability and robustness to complex dynamic characteristics. The experimental results further verify the superiority of the DA-AHA-CLT model in the full-time-step prediction of transmission line temperature. The model successfully achieves a balance between global search and local development through dynamic step size adjustment and grouped parallel search in the DA-AHA, significantly improving training efficiency and predictive capability. Furthermore, the model demonstrates consistent performance across both the training and testing phases, confirming its strong generalization ability and stability.
Despite these promising results, several limitations of this study should be acknowledged. The model heavily relies on high-quality multivariate time-series data, which may not always be available or complete in real-world applications. Additionally, the dataset used in this study is limited to a single region and time period, leaving the generalizability of the model to other regions or diverse environmental conditions untested. Furthermore, the DA-AHA-CLT model requires substantial computational resources due to its complexity, which may limit its deployment in resource-constrained environments. Another limitation lies in the dependency on future environmental variables (e.g., temperature, wind speed) as input features, which often require external forecasting models and may introduce additional uncertainty.
Future research could address these limitations by exploring the integration of real-time data collection mechanisms, testing the model across diverse datasets and regions, and developing lightweight versions of the model to reduce computational requirements. Moreover, integrating additional optimization algorithms or further enhancing the structure of the CNN-LSTM-TPA framework could improve its adaptability and scalability. This innovative optimization method and framework provide valuable technical support for solving complex time-series forecasting problems while offering opportunities for improvement and application in other domains.

Author Contributions

Conceptualization, X.J. and C.L.; methodology, X.J.; software, C.L.; validation, X.J., C.L. and B.X.; formal analysis, C.L.; investigation, X.J.; resources, C.L.; data curation, X.J.; writing—original draft preparation, X.J.; writing—review and editing, C.L.; visualization, H.H.; supervision, M.L.; project administration, H.H.; funding acquisition, B.X. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the Technology Project of State Grid Co., Ltd. (SGJLCG00YJJS2400152).

Data Availability Statement

The data presented in this study are available on request from the corresponding author due to confidentiality reasons related to laboratory data.

Conflicts of Interest

Author Beimin Xie was employed by the State Grid Jilin Electric Power Co., Ltd., Ultra High Voltage Company. The remaining authors declare that this research was conducted in the absence of any commercial or financial relationships that could be construed as potential conflicts of interest. The authors declare that this study received funding from the State Grid Jilin Electric Power Co., Ltd., Ultra High Voltage Company. The funder was not involved in the study design; in the collection, analysis, or interpretation of the data; in the writing of this article; or in the decision to submit it for publication.

References

1. Alhamrouni, I.; Kahar, N.H.A.; Salem, M.; Swadi, M.; Zahroui, Y.; Kadhim, D.J.; Mohamed, F.A.; Nazari, M.A. A comprehensive review on the role of artificial intelligence in power system stability, control, and protection: Insights and future directions. Appl. Sci. 2024, 14, 6214.
2. Zainuddin, N.M.; Rahman, M.A.; Kadir, M.A.; Ali, N.N.; Ali, Z.; Osman, M.; Nasir, N.M. Review of thermal stress and condition monitoring technologies for overhead transmission lines: Issues and challenges. IEEE Access 2020, 8, 120053–120081.
3. Paldino, G.M.; De Caro, F.; De Stefani, J.; Vaccaro, A.; Bontempi, G. Transfer learning-based methodologies for Dynamic Thermal Rating of transmission lines. Electr. Power Syst. Res. 2024, 229, 110206.
4. Yin, Y.; Le Guen, V.; Dona, J.; de Bézenac, E.; Ayed, I.; Thome, N.; Gallinari, P. Augmenting physical models with deep networks for complex dynamics forecasting. J. Stat. Mech. Theory Exp. 2021, 2021, 124012.
5. Qiu, H.; Gu, W.; Ning, C.; Lu, X.; Liu, P.; Wu, Z. Multistage mixed-integer robust optimization for power grid scheduling: An efficient reformulation algorithm. IEEE Trans. Sustain. Energy 2022, 14, 254–271.
6. Hao, Y.-Q.; Cao, Y.-L.; Ye, Q.; Cai, H.-W.; Qu, R.-H. On-line temperature monitoring in power transmission lines based on Brillouin optical time domain reflectometry. Opt.-Int. J. Light Electron Opt. 2015, 126, 2180–2183.
7. Chen, K.; Yue, Y.; Tang, Y. Research on temperature monitoring method of cable on 10 kV railway power transmission lines based on distributed temperature sensor. Energies 2021, 14, 3705.
8. Zhou, R.; Zhang, Z.; Zhang, H.; Cai, S.; Zhang, W.; Fan, A.; Xiao, Z.; Li, L. Reliable monitoring and prediction method for transmission lines based on FBG and LSTM. Adv. Eng. Inform. 2024, 62, 102603.
9. de Nazare, F.V.B.; Werneck, M.M. Hybrid optoelectronic sensor for current and temperature monitoring in overhead transmission lines. IEEE Sens. J. 2011, 12, 1193–1194.
10. Cecchi, V.; St. Leger, A.; Miu, K. Incorporating temperature variations into transmission-line models. IEEE Trans. Power Deliv. 2011, 26, 2189–2196.
11. Luo, S.; Wang, B.; Gao, Q.; Wang, Y.; Pang, X. Stacking integration algorithm based on CNN-BiLSTM-Attention with XGBoost for short-term electricity load forecasting. Energy Rep. 2024, 12, 2676–2689.
12. Arsalan, M.; Mubeen, M.; Bilal, M.; Abbasi, S.F. 1D-CNN-IDS: 1D CNN-based intrusion detection system for IIoT. In Proceedings of the 2024 29th International Conference on Automation and Computing (ICAC), Sunderland, UK, 28–30 August 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 1–4.
13. Lumazine, A.; Drakos, G.; Salvatore, M.; Armand, V.; Andros, B.; Castiglione, R.; Grigorescu, E. Ransomware Detection in Network Traffic Using a Hybrid CNN and Isolation Forest Approach; Sage Publishing: Thousand Oaks, CA, USA, 2024.
14. Al Mudawi, N.; Ansar, H.; Alazeb, A.; Aljuaid, H.; AlQahtani, Y.; Algarni, A.; Jalal, A.; Liu, H. Innovative healthcare solutions: Robust hand gesture recognition of daily life routines using 1D CNN. Front. Bioeng. Biotechnol. 2024, 12, 1401803.
15. Ullah, K.; Ahsan, M.; Hasanat, S.M.; Haris, M.; Yousaf, H.; Raza, S.F.; Tandon, R.; Abid, S.; Ullah, Z. Short-term load forecasting: A comprehensive review and simulation study with CNN-LSTM hybrids approach. IEEE Access 2024, 12, 111858–111881.
16. Wang, X.; Li, X.; Wang, L.; Ruan, T.; Li, P. Adaptive cache management for complex storage systems using CNN-LSTM-based spatiotemporal prediction. arXiv 2024, arXiv:2411.12161.
17. Shi, J.; Zhong, J.; Zhang, Y.; Xiao, B.; Xiao, L.; Zheng, Y. A dual attention LSTM lightweight model based on exponential smoothing for remaining useful life prediction. Reliab. Eng. Syst. Saf. 2024, 243, 109821.
18. Limouni, T.; Yaagoubi, R.; Bouziane, K.; Guissi, K.; Baali, E.H. Accurate one step and multistep forecasting of very short-term PV power using LSTM-TCN model. Renew. Energy 2023, 205, 1010–1024.
19. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
20. Lin, T.; Horne, B.G.; Giles, C. How embedded memory in recurrent neural network architectures helps learning long-term temporal dependencies. Neural Netw. 1998, 11, 861–868.
21. Malashin, I.; Tynchenko, V.; Gantimurov, A.; Nelyub, V.; Borodulin, A. Applications of long short-term memory (LSTM) networks in polymeric sciences: A review. Polymers 2024, 16, 2607.
22. Cavus, M.; Ugurluoglu, Y.F.; Ayan, H.; Allahham, A.; Adhikari, K.; Giaouris, D. Switched auto-regressive neural control (S-ANC) for energy management of hybrid microgrids. Appl. Sci. 2023, 13, 11744.
23. Shih, S.Y.; Sun, F.K.; Lee, H. Temporal pattern attention for multivariate time series forecasting. Mach. Learn. 2019, 108, 1421–1441.
24. Hatami, N.; Gavet, Y.; Debayle, J. Classification of time-series images using deep convolutional neural networks. In Tenth International Conference on Machine Vision (ICMV 2017); SPIE: Bellingham, WA, USA, 2018; Volume 10696, pp. 242–249.
25. Pelletier, C.; Webb, G.I.; Petitjean, F. Temporal convolutional neural network for the classification of satellite image time series. Remote Sens. 2019, 11, 523.
26. Liu, L.; Si, Y.W. 1D convolutional neural networks for chart pattern classification in financial time series. J. Supercomput. 2022, 78, 14191–14214.
27. Zhao, W.; Wang, L.; Mirjalili, S. Artificial hummingbird algorithm: A new bio-inspired optimizer with its engineering applications. Comput. Methods Appl. Mech. Eng. 2022, 388, 114194.
28. Mirjalili, S.; Lewis, A. The whale optimization algorithm. Adv. Eng. Softw. 2016, 95, 51–67.
29. Dehghani, M.; Hubálovský, Š.; Trojovský, P. Northern goshawk optimization: A new swarm-based algorithm for solving optimization problems. IEEE Access 2021, 9, 162059–162080.
30. Wang, D.; Tan, D.; Liu, L. Particle swarm optimization algorithm: An overview. Soft Comput. 2018, 22, 387–408.
Figure 1. LSTM structure.
Figure 2. TPA structure.
Figure 3. CNN structure.
Figure 4. Algorithmic optimization process.
Figure 5. Diagram of model structure.
Figure 6. Visualization of processed data.
Figure 7. DA-AHA optimizes CLT.
Figure 8. Visualization of model performance metrics.
Figure 9. DA-AHA optimization process.
Figure 10. DA-AHA-CLT testing process.
Figure 11. Results of different models.
Figure 12. Comparison of different models.
Figure 13. Comparison of different algorithms.
Table 1. Summary of strengths and limitations of different methods.

Method Type | Examples | Strengths | Limitations | Proposed Model's Innovations
Traditional Models | ARIMA, SARIMA | Good for linear trends and stationary data. | Poor at capturing nonlinear and multivariate dependencies; struggles with long-term dependencies. | Combines deep learning and optimization to handle complexity.
Deep Learning Models | CNN, LSTM, CNN-LSTM | Effective in extracting features from time series and modeling temporal relationships. | Sensitive to hyperparameter tuning; a single model struggles to capture both local and global time-series characteristics. | Integrates CNN, LSTM, and TPA with optimized hyperparameters via DA-AHA.
Proposed Model | DA-AHA-CNN-LSTM-TPA | Captures local and global features using CNN, LSTM, and TPA; dynamic hyperparameter optimization improves accuracy and robustness. | Relies on high-quality data and computational resources. | Combines multiple advantages and optimizations for superior prediction performance.
Table 2. Data sheet.

Voltage (kV) | Ambient Temperature (°C) | Wire Type | Tower Height (m) | Wind Speed (m/s) | Wind Direction (°) | Line Temperature (°C)
500 | −15.1 | steel-cored aluminum stranded wire | 30 | 4.84 | 338.18 | 81.55
500 | −12.9 | steel-cored aluminum stranded wire | 30 | 6.45 | 340.38 | 84.13
500 | −12.7 | steel-cored aluminum stranded wire | 30 | 6.33 | 338.13 | 84.29
500 | −11.5 | steel-cored aluminum stranded wire | 30 | 4.89 | 320.68 | 85.77
500 | −10.5 | steel-cored aluminum stranded wire | 30 | 6.05 | 312.18 | 87.26
Table 3. Information about the running environment.

Feature | Value
Training data (80%) | 00:00 1 January 2024 to 23:50 24 September 2024
Testing data (20%) | 00:38 25 September 2024 to 00:00 1 December 2024
Vector length | 10
Sampling rate | 1 h
Numerical environment | Python 3.9.3
Libraries | NumPy, TensorFlow, Pandas, Matplotlib, Keras, CUDA
Machine configuration | AMD Ryzen 9 5900HX @ 3.30 GHz, 16 threads; NVIDIA GeForce RTX 4070 Ti, 12 GB GDDR6X; operating system: 64-bit Windows
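To make the setup in Table 3 concrete, the sketch below shows how input windows of length 10 can be built over a series and split chronologically into 80% training and 20% testing data. The synthetic sine series and the function name are illustrative assumptions; the actual preprocessing operates on the multivariate measurements of Table 2.

```python
import numpy as np

def make_windows(series: np.ndarray, window: int = 10):
    """Turn a 1D series into (samples, window) inputs and next-step targets."""
    X = np.stack([series[i:i + window] for i in range(len(series) - window)])
    y = series[window:]
    return X, y

series = np.sin(np.linspace(0.0, 50.0, 1000))  # placeholder series
X, y = make_windows(series, window=10)         # vector length 10 (Table 3)

split = int(0.8 * len(X))                      # chronological 80/20 split
X_train, y_train = X[:split], y[:split]
X_test, y_test = X[split:], y[split:]
```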
Table 4. CLT model parameters.

Layer | Parameter | Details
Conv1D | Filters | 32
Conv1D | Kernel size | 2
Conv1D | Activation | ReLU
Conv1D | Kernel regularizer | L2 (strength 0.1)
MaxPooling1D | Pool size | 2
Dropout | Dropout rate | 0.3
LSTM | Units1 | 10
LSTM | Units2 | 10
Attention | Units | 20
Attention | Units | 10
Dense1 | Activation | ReLU
Dense2 | Units | 1
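For orientation, the layer stack of Table 4 can be assembled in Keras roughly as follows. This is a minimal sketch, not the exact implementation: the TPA mechanism is approximated by Keras's built-in dot-product Attention layer, the dense-layer width is assumed where Table 4 leaves it unspecified, and the input shape follows the window length of 10 from Table 3.

```python
import tensorflow as tf
from tensorflow.keras import layers, models, regularizers

def build_clt(window: int = 10, n_features: int = 1) -> tf.keras.Model:
    inputs = layers.Input(shape=(window, n_features))
    x = layers.Conv1D(filters=32, kernel_size=2, activation="relu",
                      kernel_regularizer=regularizers.l2(0.1))(inputs)
    x = layers.MaxPooling1D(pool_size=2)(x)
    x = layers.Dropout(0.3)(x)
    x = layers.LSTM(10, return_sequences=True)(x)   # LSTM units1 (Table 4)
    x = layers.LSTM(10, return_sequences=True)(x)   # LSTM units2 (Table 4)
    attn = layers.Attention()([x, x])               # simplified stand-in for TPA
    x = layers.GlobalAveragePooling1D()(attn)
    x = layers.Dense(20, activation="relu")(x)      # Dense1 (width assumed)
    outputs = layers.Dense(1)(x)                    # Dense2: single temperature
    return models.Model(inputs, outputs)

model = build_clt()
model.compile(optimizer="adam", loss="mse")
```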
Table 5. Scores for different models.

Model | R2 | RMSE | MAE | MedAE
CLT | 0.878 | 0.92 | 0.87 | 0.71
LSTM-TPA | 0.798 | 1.07 | 0.94 | 0.81
CNN-LSTM | 0.773 | 1.13 | 0.97 | 0.78
LSTM | 0.704 | 1.59 | 1.32 | 0.91
ARIMA | 0.853 | 0.10 | 0.90 | 0.84
SARIMA | 0.859 | 0.98 | 0.91 | 0.83
Table 6. DA-AHA parameter settings and optimization ranges.

Parameters | Details
Pop | 3
MaxIter | 40
Dim | 4
Best parameters | LSTM units1: [32, 128]; LSTM regularizer: [0.001, 0.01]; LSTM units2: [32, 64]; Learning rate: [0.001, 0.01]
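The search space of Table 6 maps naturally onto a bounds array with one row per optimized dimension (Dim = 4) and one candidate per population member (Pop = 3). The sketch below shows this encoding; the uniform initialization and the decode step are assumptions about how the optimizer interfaces with the CLT model, not the paper's exact implementation.

```python
import numpy as np

# Optimization ranges from Table 6, one row per dimension.
bounds = np.array([
    [32, 128],       # LSTM units1
    [0.001, 0.01],   # LSTM regularizer strength
    [32, 64],        # LSTM units2
    [0.001, 0.01],   # learning rate
])

pop, dim = 3, 4      # Pop = 3, Dim = 4 (Table 6)
rng = np.random.default_rng(seed=0)

# Uniform initialization of the population inside the bounds.
population = bounds[:, 0] + rng.random((pop, dim)) * (bounds[:, 1] - bounds[:, 0])

def decode(candidate: np.ndarray) -> dict:
    """Map a continuous candidate to concrete CLT hyperparameters
    (integer rounding for unit counts is an assumption)."""
    return {
        "lstm_units1": int(round(candidate[0])),
        "l2_strength": float(candidate[1]),
        "lstm_units2": int(round(candidate[2])),
        "learning_rate": float(candidate[3]),
    }

print(decode(population[0]))
```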
Table 7. Scores for different models and algorithms.

Model | R2 | RMSE | MAE | MedAE
DA-AHA-CLT | 0.987 | 0.023 | 0.018 | 0.011
WOA-CLT | 0.964 | 0.047 | 0.041 | 0.023
NGO-CLT | 0.971 | 0.053 | 0.048 | 0.035
PSO-CLT | 0.962 | 0.078 | 0.064 | 0.041
CLT | 0.878 | 0.921 | 0.871 | 0.713
LSTM-TPA | 0.798 | 1.075 | 0.943 | 0.812
CNN-LSTM | 0.773 | 1.134 | 0.972 | 0.784
LSTM | 0.704 | 1.597 | 1.324 | 0.915