1. Introduction
Lithium-ion batteries (LIBs) have become the mainstream solution for modern energy storage [1] owing to their fast charge/discharge rate, high power density, absence of memory effect, and long cycle life, and they are therefore widely used in consumer electronics and new energy vehicles (NEVs). To ensure safe operation and extend service life, LIBs must be equipped with a battery management system (BMS) for performance optimization and safety control [2]. The core function of the BMS is to keep the battery within safe operating thresholds by continuously monitoring key parameters such as the state of health (SOH) and state of charge (SOC). Among these, accurate SOH estimation is a core technical index of the BMS and is decisive for the driving experience and safety of electric vehicles (EVs) [3]. SOH is a quantitative index characterizing the aging degree and operational reliability of a battery, and it is usually defined as the ratio of the current maximum capacity to the rated initial capacity.
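Expressed as a formula, this definition of SOH at cycle k reads

\mathrm{SOH}(k) = \frac{C_{\max}(k)}{C_{\mathrm{rated}}} \times 100\%,

where C_max(k) is the currently available maximum capacity at cycle k and C_rated is the rated initial capacity.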
Methods for estimating battery SOH can be categorized into three main groups: direct measurement methods, model-based methods, and data-driven methods [4]. Direct measurement methods evaluate SOH by detecting parameters directly related to battery degradation, such as the open-circuit voltage (OCV), internal resistance, and impedance [5]. Although these methods are computationally simple and adaptable, their stringent hardware requirements limit practical online application [6]. Among model-based approaches, electrochemical models (e.g., the Pseudo-Two-Dimensional (P2D) model) can accurately capture internal physicochemical processes, but their computational complexity and parameterization challenges restrict real-world deployment [7,8]. Equivalent circuit models (ECMs) offer simpler structures [9], yet parameter identification remains problematic under dynamic operating conditions [10]. Furthermore, the strong nonlinearity of battery degradation [11] necessitates scenario-specific model reconfiguration, demanding frequent retuning for different battery chemistries [12].
In contrast, data-driven methods circumvent complex mechanistic modeling by leveraging machine learning to extract degradation features directly from operational data. Recent advances have significantly enhanced their adaptability for engineering applications: Bidirectional Long Short-Term Memory (BiLSTM) networks with self-attention mechanisms (BiLSTM-SA) enable joint SOC–SOH estimation through temporal dependency capture [13], while incremental energy analysis coupled with BiLSTM improves robustness against capacity regeneration effects [14]. Signal Temporal Logic (STL) further offers a novel framework for SOH recognition by quantitatively parsing discharge curve patterns [15]. These innovations address critical limitations of traditional approaches, particularly in handling nonlinear aging dynamics under real-world operating variability [16].
In recent years, data-driven methods [17] have garnered significant academic and industrial attention due to their flexibility and versatility. These models rely primarily on operational data (current, voltage, and temperature) to establish correlations between inputs and battery health status, circumventing the explicit mechanistic modeling of aging processes. Common implementations include Gaussian process regression [18], support vector machines [19], extreme learning machines [20], and gray correlation analysis [21]. However, while classical time series algorithms like ARIMA/SARIMA and VAR offer interpretability for stationary data [22], they struggle with the nonstationary dynamics of battery degradation—particularly under varying operational loads and temperature cycles where internal parameters exhibit strong time-varying dependencies [22].
Machine learning techniques demonstrate a superior capability in this domain: their inherent adaptability to complex nonlinear patterns enables robust feature extraction from multidimensional operational data, significantly outperforming conventional statistical models in handling capacity regeneration phenomena and noise-corrupted measurements. Recent innovations further highlight this advantage—XGBoost–ARIMA joint optimization frameworks effectively mitigate ARIMA's noise sensitivity through machine learning-based residual correction [23], while the auto-regressive integrated moving average model with exogenous variables (ARIMAX) requires explicit modeling of parameter variation to maintain accuracy [22]. Such limitations underscore machine learning's critical role in achieving generalizable SOH prediction without manual system reconfiguration.
In the study of battery health state prediction based on CNN-GRU-Attention modeling, the optimal selection of health indicators (HIs) and model architecture refinement are critical for accuracy enhancement. While traditional signal processing techniques (e.g., the wavelet transform [24] and Fourier transform [25]) offer lower computational costs, they require the manual construction of basis functions, which may fail to capture complex nonlinear degradation patterns—particularly under variable operating conditions where sudden capacity drops and internal state shifts occur [26]. Meanwhile, recurrent architectures (e.g., GRU/LSTM) exhibit constrained capacity in modeling long-range temporal dependencies within battery aging data [27]. This motivates the adoption of transformer-based techniques: their self-attention mechanisms inherently capture global context across entire charge–discharge cycles [28], while hybrid frameworks (e.g., Transformer–GRU parallel architectures [29]) synergistically integrate local feature extraction [30] with inter-cycle dependency modeling.
Previous studies on CNN-BIGRU-Attention optimization algorithms remain limited, particularly in addressing issues such as poor initialization stability and inefficient hyperparameter tuning. The initialization of CNN and GRU weights significantly impacts model convergence and performance, yet traditional random initialization often leads to unstable training. Existing data-driven models for lithium battery health prediction [31,32] also suffer from gradient explosion risks, noise-induced prediction instability, and insufficient multi-source feature fusion. To overcome these limitations, this study introduces the Hiking Optimization Algorithm (HOA), which enhances hyperparameter optimization efficiency through dynamically adjusted convergence factors and variance probabilities. The proposed framework employs a CNN with dilated convolutions to capture multi-granularity degradation features, a BIGRU layer to model temporal dependencies, and a channel-time dual-domain attention mechanism to dynamically weight feature contributions—overcoming the shortcomings of traditional single-domain attention.
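The precise layer configuration used in this work is given later in Table 6. Purely as an illustration of the kind of pipeline described above—dilated 1-D convolutions, a bidirectional GRU, and a channel–time dual-domain attention block—a minimal PyTorch-style sketch is shown below; the layer sizes and the attention formulation are illustrative assumptions, not the configuration used in the experiments.

```python
import torch
import torch.nn as nn

class DualDomainAttention(nn.Module):
    """Illustrative channel + time-step attention (squeeze-and-excite style)."""
    def __init__(self, channels):
        super().__init__()
        self.channel_fc = nn.Sequential(
            nn.Linear(channels, channels // 2), nn.ReLU(),
            nn.Linear(channels // 2, channels), nn.Sigmoid())
        self.time_conv = nn.Sequential(
            nn.Conv1d(channels, 1, kernel_size=1), nn.Sigmoid())

    def forward(self, x):                       # x: (batch, channels, time)
        ch_w = self.channel_fc(x.mean(dim=2))   # (batch, channels)
        x = x * ch_w.unsqueeze(-1)              # channel re-weighting
        t_w = self.time_conv(x)                 # (batch, 1, time)
        return x * t_w                          # time-step re-weighting

class CnnBiGruAttention(nn.Module):
    def __init__(self, n_features=8, conv_ch=32, gru_hidden=64):
        super().__init__()
        # dilated 1-D convolutions: multi-granularity local degradation features
        self.cnn = nn.Sequential(
            nn.Conv1d(n_features, conv_ch, 3, padding=1, dilation=1), nn.ReLU(),
            nn.Conv1d(conv_ch, conv_ch, 3, padding=2, dilation=2), nn.ReLU())
        self.bigru = nn.GRU(conv_ch, gru_hidden,
                            batch_first=True, bidirectional=True)
        self.attn = DualDomainAttention(2 * gru_hidden)
        self.head = nn.Linear(2 * gru_hidden, 1)        # SOH regression output

    def forward(self, x):                       # x: (batch, time, n_features)
        z = self.cnn(x.transpose(1, 2))         # -> (batch, conv_ch, time)
        h, _ = self.bigru(z.transpose(1, 2))    # -> (batch, time, 2*gru_hidden)
        a = self.attn(h.transpose(1, 2))        # re-weight channels and time steps
        return self.head(a.transpose(1, 2)[:, -1, :])   # SOH from last step

# shape check with a dummy batch: 4 sequences, 20 cycles, 8 features per cycle
model = CnnBiGruAttention()
print(model(torch.randn(4, 20, 8)).shape)       # torch.Size([4, 1])
```

In this sketch the attention block follows the BiGRU, consistent with the CNN-BIGRU-ATTENTION naming; the HOA would then tune quantities such as the filter count, hidden size, and learning rate.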
This work aims to improve the stability of model predictions and reduce the computational burden of the model. Building on a previous study that used the CNN-BIGRU model [33] to estimate the SOH aging of lithium batteries, this paper introduces the HOA to optimize the hyperparameters and attention mechanism of the baseline model, thereby improving the model's structure and enhancing its prediction accuracy and fitting ability for SOH aging. The main contributions are summarized as follows.
(1) First, the HOA is introduced to optimize the CNN-BIGRU model, which significantly improves the convergence speed and training stability of the model, as well as enhancing the prediction performance of the network.
(2) Second, in terms of feature engineering, eight basic feature factors are extracted from the NASA dataset for characterizing SOH, and a multidimensional feature space is constructed to provide input data for the model with more characterization capabilities.
(3) Finally, an attention mechanism is integrated into the model; through dynamic weight assignment and explicit dependency modeling, it effectively addresses the original model's limitations regarding feature redundancy, long time-series dependency, and noise sensitivity, while also improving the interpretability of the model.
For clarity, the key abbreviations and mathematical variables employed throughout this study are defined in Table 1.
The workflow of this study is shown in Figure 1. First, four time-domain feature factors—CVRT, CVFT, CCDT, and CCCT—and the second-derivative feature of the IC curve are extracted from the NASA battery dataset as model input features for predicting battery health status. For data partitioning, a stratified sampling method is used to divide the dataset into a training set and a test set in a 7:3 ratio. After the SOH prediction model is constructed on the training set, its prediction performance on the test set is systematically evaluated using accuracy metrics such as the root mean square error, mean absolute error, and coefficient of determination. The overall organization of the paper is also illustrated in Figure 1 below.
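For reference, the three accuracy metrics just mentioned can be computed as in the following minimal Python sketch; the SOH series and the chronological 7:3 split below are dummy placeholders (the experiments themselves were run in Matlab).

```python
import numpy as np

def evaluate_soh(y_true, y_pred):
    """RMSE, MAE, and coefficient of determination for SOH predictions."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    mae = np.mean(np.abs(y_true - y_pred))
    r2 = 1.0 - np.sum((y_true - y_pred) ** 2) / np.sum((y_true - y_true.mean()) ** 2)
    return {"RMSE": rmse, "MAE": mae, "R2": r2}

# illustrative 7:3 split of a dummy cycle-wise SOH series
soh = np.linspace(1.0, 0.7, 168)                 # placeholder degradation curve
split = int(0.7 * len(soh))
train, test = soh[:split], soh[split:]
noisy_pred = test + np.random.normal(0.0, 0.005, test.size)
print(evaluate_soh(test, noisy_pred))
```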
4. Experimental Validation and Model Comparison
4.1. Experimental Setup
During the experiments, 70% of the cycle data is used for model training and the remaining 30% for model testing. The lithium battery cycling data are obtained from the NASA lithium battery cycling dataset and the University of Maryland dataset. The hardware and software configuration used for the experiments is a 13th Gen Intel(R) Core(TM) i5-13400F CPU, 24 GB RAM, an NVIDIA GeForce RTX 4060Ti GPU, and Matlab R2022b.
The detailed structure of the HOA-CNN-BIGRU-ATTENTION model is presented in Table 6.
The parameter settings of the Hiking Optimization Algorithm have a decisive impact on the performance of the CNN-BIGRU-ATTENTION model: a reasonable configuration not only improves the convergence speed and training stability of the model but also significantly improves its generalization ability on the test set. The choice of hyperparameters directly affects the synergy of the model components (convolutional layers, bidirectional gated recurrent units, and attention mechanism), including the local sensitivity of feature extraction, the ability to model temporal dependence, and the weighting of key information. The value range of each parameter, its theoretical basis, and its effect on model performance are listed in detail in Table 7 to provide a reproducible configuration benchmark for subsequent experiments.
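Table 7 gives the ranges actually used. Purely as an illustration of how such a hyperparameter space can be handed to a population-based optimizer like the HOA, a hedged Python sketch follows; the parameter names, bounds, and the train_and_validate callback are placeholders, not the values from Table 7.

```python
import numpy as np

# hypothetical search space (NOT the ranges of Table 7): (lower, upper) per hyperparameter
SEARCH_SPACE = {
    "learning_rate": (1e-4, 1e-2),
    "conv_filters":  (16, 64),
    "bigru_units":   (32, 128),
    "attention_dim": (8, 64),
}

def decode(position):
    """Map a hiker's position vector in [0, 1]^d onto concrete hyperparameters."""
    params = {}
    for p, (name, (lo, hi)) in zip(position, SEARCH_SPACE.items()):
        val = lo + (hi - lo) * float(np.clip(p, 0.0, 1.0))
        params[name] = val if name == "learning_rate" else int(round(val))
    return params

def fitness(position, train_and_validate):
    """Fitness of one hiker = validation RMSE of the model built from its position."""
    return train_and_validate(**decode(position))   # train_and_validate is user-supplied

# each HOA hiker carries one position vector; the lowest-RMSE hiker is kept
print(decode(np.random.rand(len(SEARCH_SPACE))))
```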
4.2. Comparison of Precision
Figure 5 compares, in terms of root mean square error, the prediction performance of the CNN-BIGRU-ATTENTION model optimized with the Hiking Optimization Algorithm (HOA), with the sparrow search algorithm (SSA), and without any optimization algorithm (NONE) on the four NASA battery charge/discharge cycling datasets B0005, B0006, B0007, and B0018. The results show that the RMSEs of HOA and SSA are significantly lower than those of the NONE baseline on all datasets, indicating that both optimization algorithms effectively improve prediction accuracy. The HOA performs best on B0005 to B0018, with RMSE reductions of 0.416%, 0.658%, 0.081%, and 0.137% relative to SSA, respectively. Notably, the RMSE for B0006 is generally higher than for the other batteries, which is presumably related to the nonlinear capacity decline caused by accelerated electrolyte decomposition in this batch of cells. In addition, the RMSE variance of the unoptimized CNN-BIGRU-ATTENTION (NONE) model is significantly larger than that of the HOA- and SSA-optimized models, further validating the advantage of the optimization algorithms in enhancing prediction stability. Taken together, these results show that choosing an appropriate optimization algorithm for the specific battery cycling characteristics can effectively improve prediction accuracy.
Figure 6, Figure 7, Figure 8 and Figure 9 compare the fit of the CNN-BIGRU-ATTENTION model optimized with the Hiking Optimization Algorithm, with the sparrow search algorithm, and without any optimization algorithm on NASA's four battery charge/discharge cycling datasets, B0005, B0006, B0007, and B0018. From the fitted segments of the test set, the Hiking Optimization Algorithm (HOA) yields the best characterization of the SOH aging behavior, the best fit, and the lowest root mean square error.
In addition, this study introduces a CNN-BILSTM-ATTENTION model whose hyperparameters were likewise optimized by the HOA for comparative experiments. Repeated experiments show that, under the same algorithm parameter settings, the HOA-CNN-BIGRU-ATTENTION model proposed in this study achieves the best prediction and fitting performance on the NASA aging dataset.
The experimental results show that the HOA exhibits significant performance advantages in SOH prediction for the four battery samples B0005–B0018. Compared with the unoptimized (NONE) and SSA-optimized models, the prediction curves of the HOA-optimized model (red) are closer to the real SOH values (black) throughout the cycling process, and the fitting accuracy remains high, especially in the later stage (cycle > 100).
The above four sets of fitting comparison curves show that the CNN-BIGRU-ATTENTION model with the HOA performs better in predicting the SOH of lithium batteries; the remaining evaluation metrics are listed in Table 8 below.
The experimental data show that the RMSE values of the HOA-optimized CNN-BIGRU-ATTENTION model on the four cell samples (B0005, B0006, B0007, and B0018) are 0.00689, 0.01847, 0.00857, and 0.00805, respectively, which are significantly lower than those of the comparison models SSA-CNN-BIGRU-ATTENTION and CNN-BIGRU-ATTENTION. For example, on sample B0005 the RMSE of the HOA-optimized model is 38.6% lower than that of the SSA-optimized model (0.01105→0.00689) and 42.7% lower than that of the unoptimized model (0.01202→0.00689). This result verifies that the HOA effectively suppresses prediction bias and improves the generalization ability of the model through parameter optimization.
The analysis of the R2 metric reveals that the goodness-of-fit of the HOA-optimized model leads on all four datasets (0.969, 0.922, 0.948, and 0.916), with an average improvement of 4.3% over the SSA-optimized model and 6.8% over the base model. On the B0005 sample in particular, R2 reaches 0.969, close to the theoretical optimum of 1, indicating that the model explains 96.9% of the variance in the dependent variable. This systematic advantage stems from the HOA's global optimization of the model parameters, which more accurately captures the nonlinear characteristics of the battery degradation process.
The comparison of RPD metrics further supports the robustness of the HOA-optimized model. Its RPD values on the four samples range from 4.02 to 7.55, much higher than those of the other two methods (at most 4.74 for the SSA-optimized model and at most 6.38 for the base model). According to the criterion that RPD > 2.5 indicates good model reliability, the HOA-optimized model reaches the "excellent" level (RPD > 4) in all test cases; on sample B0005 in particular, the RPD reaches 7.55, indicating that its predictions have high stability and practical value. This property is crucial for engineering applications in battery health management.
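For completeness, the RPD figures above are assumed here to follow the standard ratio-of-performance-to-deviation definition, i.e., the standard deviation of the measured SOH values in the test set divided by the prediction RMSE:

\mathrm{RPD} = \frac{\mathrm{SD}(y_{\mathrm{true}})}{\mathrm{RMSE}},

so that larger values indicate a prediction error that is small relative to the natural spread of the data.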
Figure 10, Figure 11, Figure 12 and Figure 13 compare the fit of the HOA-CNN-BIGRU-ATTENTION, CNN-KAN, and BIGRU models on the four battery charge–discharge cycling datasets from the University of Maryland dataset. From the fitted segments of the test set, the HOA-CNN-BIGRU-ATTENTION model provides the best characterization of the SOH aging behavior, the best fit, the lowest root mean square error, and the strongest generalization ability.
Table 9 below compares the prediction errors and curve-fitting degrees of the HOA-CNN-BIGRU-ATTENTION model with those of the more recent CNN-KAN and BIGRU models for SOH prediction on four sets of charge–discharge data from the University of Maryland, giving a precise and intuitive picture of the generalization ability and prediction accuracy of the proposed model.
Through quantitative analysis of the charging and discharging data of four groups of batteries (CS2-35 to CS2-38) from the University of Maryland, the HOA-CNN-BIGRU-ATTENTION model demonstrated significant predictive performance advantages. This model achieved the lowest RMSE values (0.00127 to 0.00589) across all tested batteries, on average 34.7% and 76.2% lower than those of the CNN-KAN and BIGRU models, respectively. Its performance on the CS2-35 battery was especially outstanding, with the RMSE reduced by 87.4% compared to BIGRU. Meanwhile, its R2 values were generally close to or above 0.99 (0.98113 to 0.99922), significantly higher than those of the comparison models, reaching 0.99922 and 0.99826 on the CS2-35 and CS2-37 batteries, respectively, with improvements of 0.5% to 2.7% over CNN-KAN and 1.3% to 4.8% over BIGRU. Additionally, the RMSE fluctuation range of this model across the four groups of batteries was the smallest (0.00462), much lower than that of CNN-KAN (0.00549) and BIGRU (0.01123), demonstrating stronger stability and generalization ability. These results show that the HOA-CNN-BIGRU-ATTENTION model, by integrating the attention mechanism with a hybrid architecture, has statistically significant advantages in predictive accuracy and consistency and is suitable for high-precision battery health status prediction tasks.
Rigorous simulation pre-tests conducted across eight battery groups within the two datasets above robustly demonstrate the universality, generalization capability, and innovative nature of the proposed model. This foundational validation underscores the model's potential for integration with advanced artificial intelligence (AI) and machine learning (ML) paradigms. AI/ML techniques are pivotal in contemporary electric vehicle research and central to advancing critical areas, including battery management system (BMS) enhancement, precise prognostics of component health (encompassing battery aging), and overall vehicle performance optimization.
4.3. Mathematical Statistical Analysis
In order to provide evidence that the HOA-CNN-BIGRU-ATTENTION model outperforms the other three groups of models in terms of statistical performance, this paper verifies the statistical significance of the performance differences among the four model groups in the above comparative experiments. Using the non-parametric test framework recommended by Derrac et al. [37,38], the RMSE and R2 of the four groups of NASA battery data are independently ranked (with the best ranked 1 and the worst ranked 4); the average rank of each of the four models is then calculated, and the Friedman test is performed.
As can be seen from Table 8 above, for the SOH aging data of the three battery groups (B0005–B0007), the RMSE and R2 indicators of the four models were ranked and averaged: HOA-CNN-BIGRU-ATTENTION ranks first on average, HOA-CNN-BILSTM-ATTENTION second, SSA-CNN-BIGRU-ATTENTION third, and CNN-BIGRU-ATTENTION fourth.
The Friedman statistic for B0005 is calculated using Formula (16); the same procedure is applied to B0006 and B0007. In the formula, k is the number of compared algorithms, N is the product of the number of battery groups and the number of metrics, and the term involving Rj is the sum of the squares of the average ranks of the four compared algorithms.
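Assuming Formula (16) is the standard form of the Friedman statistic, it can be written as

\chi_F^2 = \frac{12N}{k(k+1)}\left[\sum_{j=1}^{k} R_j^2 - \frac{k(k+1)^2}{4}\right],

where R_j denotes the average rank of the j-th algorithm over the N ranked cases.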
For B0005–B0007, the value of χ²F is 24 in every case. Then, using Python 3.13.5 code, as shown in the pseudo-code in Table 10 below, the p-value is calculated to be <0.0001, and the statistic of 24 far exceeds the critical value of 11.4 [37,38], indicating that the proposed HOA-CNN-BIGRU-ATTENTION model ranks first in all dataset combinations (the same applies to the B0018 battery group and the University of Maryland dataset). The Friedman test reaches its theoretical maximum, χ²F = 24 (df = k − 1 = 3), with an exact p < 0.0001. This once again verifies, in statistical terms, the significant superiority of the algorithm proposed in this paper; such a result far exceeds conventional significance level requirements.
4.4. Transfer Learning Prediction
In addition, this study considers that under actual vehicle operating conditions, besides the accurate prediction of the aging of individual battery groups, aging prediction across battery groups is also particularly important for the stable operation of the vehicle. Therefore, four battery groups from the University of Maryland dataset, namely CS2-35, CS2-36, CS2-37, and CS2-38, are selected, and the complete aging cycle data of three of the groups are used as the training set to predict the aging cycle of the remaining battery. This provides a new approach for the aging prediction of individual battery groups in vehicle operation as well as a new idea for mutual aging prediction between different battery groups, and further demonstrates the generalization and universality of the model. Using the battery aging data of groups CS2-36, CS2-37, and CS2-38 from the public University of Maryland dataset as the training input for this model, the comparison and fitting curves for predicting the SOH aging law of the CS2-35 battery group are shown in Figure 14 below.
When the CS2-36, CS2-37, and CS2-38 data are used as the training input for the proposed model to predict the SOH aging pattern of CS2-35, the RMSE of the predicted results is 0.06235. This transfer learning experiment demonstrates the generalization ability of the model across different battery groups. Owing to time constraints and the lack of real vehicle data, this paper provisionally selects the University of Maryland dataset as the object of transfer learning, in place of future real vehicle data, to verify the model's generalization ability across battery groups. As laboratory battery aging experiments continue, test data become more abundant, and the proposed model is further optimized and improved, it is expected to be widely applied.
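A schematic of this leave-one-battery-out protocol is sketched below; the arrays are dummy placeholders standing in for the real CS2 cycle data, and the split simply concatenates the three training groups.

```python
import numpy as np

# dummy per-battery cycle-wise SOH curves standing in for the real CS2 data
data = {name: np.linspace(1.0, 0.75, 800) for name in
        ("CS2-35", "CS2-36", "CS2-37", "CS2-38")}

def leave_one_out(data, target):
    """Train on all battery groups except `target`; test on `target`."""
    train = np.concatenate([curve for name, curve in data.items() if name != target])
    return train, data[target]

train_set, test_set = leave_one_out(data, "CS2-35")
print(train_set.shape, test_set.shape)   # (2400,) (800,)
```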
Overall, this paper demonstrates good predictive capability for the SOH aging of single battery packs on both the NASA dataset and the University of Maryland dataset, which supports the generalization and scientific soundness of the proposed model. To compensate for the lack of real vehicle data, transfer prediction experiments between battery packs are also added. However, owing to time constraints and the volume, complexity, and difficulty of transfer learning, the accuracy and fit of the model in transfer learning still need to be improved to cope with SOH prediction tasks for battery packs in future real vehicle operation. Nevertheless, the existing series of experiments confirms that the logic of the proposed model is scientifically reasonable and consistent with actual behavior. Continuously improving and optimizing the model and conducting a series of real vehicle battery SOH aging experiments for its training and refinement are the focus of our next work.
5. Conclusions
In this paper, an improved CNN-BIGRU-ATTENTION model based on the Hiking Optimization Algorithm (HOA) is proposed and validated on the NASA dataset and the University of Maryland state-of-health (SOH) dataset. The main findings are as follows:
In this study, a CNN-BIGRU model optimization method based on the Hiking Optimization Algorithm (HOA) is proposed, which significantly improves the convergence speed, training stability, and prediction performance of the model through the adaptive parameter tuning mechanism of the HOA, thus providing a new solution for the parameter optimization of complex models.
In this study, eight basic feature factors are extracted from NASA public datasets and the Maryland University dataset, including four key feature factors for characterizing the state of health (SOH) of lithium-ion batteries, which enhances SOH characterization in a multi-dimensional and highly discriminative manner. These optimized feature sets are used as input data for the HOA-CNN-BIGRU-Attention model, aiming to improve the model’s prediction accuracy of true battery capacity. This feature engineering approach not only enriches the physical meaning of the input data by combining the base features with the higher-order information of the IC curves, but also significantly enhances the model’s ability to characterize battery degradation patterns, thus providing a more reliable basis for the accurate estimation of SOH.
In this study, the attention mechanism is introduced into the CNN-BIGRU architecture, which, by establishing a dynamic weight allocation strategy and explicit dependency modeling, effectively addresses three key problems of the original model: (1) interference from redundant information in the high-dimensional feature space; (2) insufficient modeling of long time-series dependencies; and (3) excessive sensitivity to input noise. The experimental results show that this improvement not only significantly improves model performance (see the quantitative comparisons in Section 4), but also enhances the transparency and interpretability of the model's decision-making process through visual analysis of the attention weights, providing a more robust deep learning solution for time-series signal processing tasks.
Looking ahead, future research should focus on enhancing the interpretability and robustness of hybrid deep learning models like HOA-CNN-BiGRU-Attention through explainable AI (XAI) techniques, while exploring advanced metaheuristic optimizers (e.g., quantum-inspired or transfer learning-based algorithms) for high-dimensional hyperparameter spaces. Investigations into multi-modal data fusion incorporating thermal, electrochemical impedance spectroscopy (EIS), and mechanistic degradation models could further improve SOH prediction generalizability across diverse battery chemistries and operating conditions. Additionally, developing lightweight variants optimized for edge-computing deployment in battery management systems represents a critical direction for real-world industrial adoption.
Overall, the proposed method achieves accurate estimation of the SOH of Li-ion batteries in different cases. In future work, the model will be simplified to reduce computational resource consumption while guaranteeing prediction accuracy.